Counterfactual Voting Adjustment for Quality Assessment and Fairer Voting in Online Platforms with Helpfulness Evaluation

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
Abstract: Efficient access to high-quality information is vital for online platforms. To promote more useful information, users not only create new content but also evaluate existing content, often through helpfulness voting. Although aggregated votes help service providers rank user content, these votes are often biased by the disparate accessibility of different display positions and by the cascaded influence of prior votes. For a fairer assessment of information quality, we propose the Counterfactual Voting Adjustment (CVA), a causal framework that accounts for the context in which individual votes are cast. Through preliminary and semi-synthetic experiments, we show that CVA effectively models position and herding biases, accurately recovering the predefined content quality. In an experiment on real data, we demonstrate that reranking content based on the quality learned by CVA exhibits stronger alignment with both user sentiment and quality evaluation assessed by GPT-4o, outperforming system rankings based on aggregated votes and model-based rerankings without causal inference. Beyond individual quality inference, our embeddings offer comparative insights into the behavioral dynamics of expert user groups across 120 major StackExchange communities.
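The abstract does not spell out the model, but the core idea of adjusting votes for the context in which they were cast can be illustrated with a small simulation. The sketch below is hypothetical and is not the authors' CVA implementation: it assumes a logistic vote model whose names (pos_bias, herd) and functional form are invented for illustration. It generates votes biased by display position and by the sign of the running score, then recovers per-item quality by regressing votes on item indicators plus the two context covariates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: latent quality, a position-bias curve, and a
# herding coefficient. All three are assumptions for this toy example.
n_items, n_votes = 50, 20_000
true_q = rng.normal(0.0, 1.0, n_items)           # latent content quality
pos_bias = -4.0 * np.arange(n_items) / n_items   # attention decays with rank
herd = 0.6                                       # herding strength

# Simulate observational voting: the platform shows items ranked by
# running aggregate score; each vote depends on quality, the inspected
# position, and the sign of the item's prior score (herding).
score = np.zeros(n_items)
records = []
for _ in range(n_votes):
    order = np.argsort(-score)          # current ranking by raw votes
    k = rng.integers(n_items)           # position the voter inspects
    i = order[k]
    logit = true_q[i] + pos_bias[k] + herd * np.sign(score[i])
    up = rng.random() < 1.0 / (1.0 + np.exp(-logit))
    records.append((i, k, np.sign(score[i]), float(up)))
    score[i] += 1.0 if up else -1.0

# Design matrix: one-hot item indicators plus two context covariates
# (normalized position, prior-score sign).
X = np.zeros((n_votes, n_items + 2))
y = np.empty(n_votes)
for r, (i, k, s, up) in enumerate(records):
    X[r, i] = 1.0
    X[r, n_items] = k / n_items
    X[r, n_items + 1] = s
    y[r] = up

# Fit by plain gradient ascent on the logistic log-likelihood; the item
# coefficients are the context-adjusted quality estimates.
w = np.zeros(n_items + 2)
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-X @ w))
    w += 2.0 * X.T @ (y - p) / n_votes

adjusted_q = w[:n_items]

def spearman(a, b):
    # rank correlation via double argsort
    ra, rb = np.argsort(np.argsort(a)), np.argsort(np.argsort(b))
    return np.corrcoef(ra, rb)[0, 1]

print(f"rank corr(true, raw votes): {spearman(true_q, score):.3f}")
print(f"rank corr(true, adjusted):  {spearman(true_q, adjusted_q):.3f}")
```

On this toy data, ranking by the context-adjusted item coefficients tracks the true quality more closely than ranking by raw aggregate votes, which is the qualitative claim the abstract makes for CVA against vote-count rankings.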
Lay Summary: Online information such as product reviews and Q&A content is highly valuable, and efficient access to high-quality information benefits users and platforms. The helpfulness voting (upvote/downvote) feature has been widely adopted to address challenges like information overload, diversity, and noise. However, helpfulness votes are often biased by social influences, such as prior votes and display ranking. This work investigates how these biases affect voting behavior and proposes a framework that integrates causal inference with a behavioral model to learn fairer assessments of information quality using only observational data. Our results show that reranking content based on the proposed model better aligns with true quality proxies like comment sentiment and GPT-4o evaluations, outperforming rankings based on raw vote scores or models lacking causal adjustments. With our fairer estimated content quality, platforms can apply better rankings and strengthen their content as a valuable knowledge asset. For users, improved rankings reduce the cognitive effort of finding useful information and support fair recognition of their under-appreciated contributions. Additionally, by quantifying the biases, our model offers insights into the behavioral patterns of different StackExchange communities, enabling platforms to understand and address community-specific voting dynamics without costly interventions or hiring human moderators.
Primary Area: Applications->Social Sciences
Keywords: helpfulness voting, question answers, position bias, herding bias, online evaluation, causal effect
Submission Number: 10840