Large Language Models Systematically Favor Popular Options: Evidence and Mitigation Across Multiple Choice Tasks
Keywords: large language models, multiple-choice question answering, popularity bias, debiasing
Abstract: Multiple-choice questions (MCQs) are widely used for benchmarking large language models (LLMs). We show that modern LLMs systematically favor popular distractors over less popular correct options. We introduce PopMCQ, a suite of six stress/control manipulations for MCQs that alter option popularity while keeping the gold label fixed. We apply these strategies to the PlausibleQA evaluation built from NQ, TriviaQA, MuSiQue, and QASC, and quantify bias via the Spearman rank correlation between correctness and relative popularity surplus. We then introduce PopDebias, an inference-time correction that removes a label-free popularity prior and requires no LLM fine-tuning (with an optional lightweight calibration step). Averaged across all datasets and strategies, PopDebias improves the accuracy of all 23 models evaluated. The finding also holds at the individual dataset level: the method boosts accuracy for at least 20 of the 23 models on every dataset we tested (NQ: 23/23, QASC: 22/23, MuSiQue: 22/23, TriviaQA: 20/23), demonstrating broad effectiveness.
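The abstract does not spell out the bias measure or the correction in code; below is a minimal sketch under stated assumptions, not the authors' implementation. It assumes the bias measure is the Spearman correlation between per-question correctness and the popularity surplus, and that a PopDebias-style correction subtracts option log-probabilities measured under a content-free prompt (the label-free prior) from the question-conditioned scores, with `lam` standing in for the optional calibration weight; all function and parameter names here are hypothetical.

```python
import numpy as np
from scipy.stats import spearmanr

def popularity_bias(correct, popularity_surplus):
    """Spearman rank correlation between per-question correctness (0/1)
    and the relative popularity surplus of the chosen option, in the
    spirit of the abstract's bias measure (assumed formulation)."""
    rho, _ = spearmanr(correct, popularity_surplus)
    return rho

def popdebias_predict(option_logprobs, prior_logprobs, lam=1.0):
    """Hypothetical PopDebias-style correction: subtract a label-free
    popularity prior (option log-probs under a content-free prompt,
    i.e. without the question) from the conditioned scores.

    option_logprobs: log p(option | question), shape (n_options,)
    prior_logprobs:  log p(option) under the content-free prompt
    lam:             optional calibration weight for the prior
    """
    scores = np.asarray(option_logprobs) - lam * np.asarray(prior_logprobs)
    return int(np.argmax(scores))  # index of the debiased prediction
```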
Primary Area: generative models
Submission Number: 20290