No Question, No Passage, No Problem: Investigating Artifact Exploitation and Reasoning in Multiple-Choice Reading Comprehension

Published: 27 Oct 2025, Last Modified: 27 Oct 2025. NeurIPS Lock-LLM Workshop 2025 Poster. License: CC BY 4.0
Keywords: Large Language Models, Dataset Artifacts, Multiple-Choice Reading Comprehension, Machine Reading Comprehension, Multiple-Choice QA, Partial-Input Prompting
TL;DR: We show that large language models consistently surpass majority baselines in multiple-choice reading comprehension even without passages or questions, revealing reasoning strategies beyond simple artifact exploitation.
Abstract: Large language models (LLMs) can exceed majority-baseline performance on NLP tasks even when deprived of parts of the input, raising concerns that benchmarks reward artifacts rather than reasoning. Prior work has demonstrated this phenomenon in multiple-choice QA and natural language inference, but not in multiple-choice reading comprehension (MCRC), where both a passage and a question are integral to the task. We study MCRC under a stricter ablation, removing both the passage and the question to leave only the answer options. Despite this severe ablation, models consistently exceed majority baselines across five benchmarks. To probe how such accuracy arises, we introduce two reasoning-based strategies: Process-of-Elimination, which iteratively discards distractors, and Abductive Passage Inference, which infers a context to justify an option. Both strategies closely track choices-only accuracy, suggesting that strong performance reflects genuine reasoning procedures rather than dataset artifacts alone.
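For a concrete picture of the setup described above, the Python sketch below shows how a choices-only prompt and a Process-of-Elimination loop might be implemented. The prompt wording, the `ask` callable, and the letter-parsing heuristic are illustrative assumptions for this sketch, not the paper's actual prompts or code.

```python
from typing import Callable, List

def choices_only_prompt(options: List[str]) -> str:
    """Format a choices-only query: no passage, no question, only the answer options."""
    lines = ["Which of the following options is most likely the correct answer?"]
    lines += [f"{chr(65 + i)}. {opt}" for i, opt in enumerate(options)]
    lines.append("Answer with a single letter.")
    return "\n".join(lines)

def process_of_elimination(options: List[str], ask: Callable[[str], str]) -> str:
    """Iteratively discard the least plausible option until one remains.

    `ask` is any callable that sends a prompt to an LLM and returns its text reply
    (e.g., a thin wrapper around a chat-completion API); it is a placeholder here.
    """
    remaining = list(options)
    while len(remaining) > 1:
        prompt = (
            "Without a passage or question, which option below is LEAST likely "
            "to be the correct answer? Reply with just its letter.\n"
            + "\n".join(f"{chr(65 + i)}. {opt}" for i, opt in enumerate(remaining))
        )
        reply = ask(prompt).strip().upper()
        # Parse the returned letter; fall back to the last option if the reply is unusable.
        if reply and "A" <= reply[0] <= chr(64 + len(remaining)):
            idx = ord(reply[0]) - 65
        else:
            idx = len(remaining) - 1
        remaining.pop(idx)  # drop the eliminated distractor
    return remaining[0]
```

Usage would pair `process_of_elimination` with any model-query function, e.g. `process_of_elimination(["Paris", "London", "Berlin", "Rome"], ask=my_llm_call)`.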
Submission Number: 61