Keywords: Multiple-Choice Question Answering, First-Token Probability, Prefilling Attack, Interpretability, Large Language Models
Abstract: Large Language Models (LLMs) are traditionally evaluated on multiple-choice question answering (MCQA) tasks using *First-Token Probability* (FTP), which selects the answer option whose initial token has the highest likelihood. While efficient, FTP can be fragile: models may assign high probability to unrelated tokens (*misalignment*) or use a valid token merely as part of a generic preamble rather than as a clear answer choice (*misinterpretation*), undermining the reliability of symbolic evaluation. We propose a simple solution: *output prefilling*, a structured natural-language prefix (e.g., "The correct option is:") prepended to the model's output. Although prefilling was originally explored in AI safety as an attack strategy, we repurpose it to steer the model toward a clean, valid option, without modifying its parameters. Through extensive evaluation, we find that FTP with prefilling substantially improves accuracy, calibration, and output consistency across a broad set of LLMs and MCQA benchmarks. It outperforms standard FTP and often matches the performance of open-ended generation approaches that require full decoding and external classifiers, while being significantly more efficient. Our analysis suggests that prefilling is a simple, robust, and zero-cost method to enhance the reliability of FTP-based evaluation in multiple-choice settings.
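A minimal sketch of how FTP with prefilling could be implemented, assuming a HuggingFace causal LM; the model name `gpt2`, the prompt template, and the helper `ftp_answer` are illustrative choices, not the paper's implementation:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative model; any causal LM is scored the same way.
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def ftp_answer(question: str,
               options=("A", "B", "C", "D"),
               prefill="The correct option is:"):
    # Prefilling: append the structured prefix to the prompt so the
    # model's very next token is steered toward a clean option letter.
    prompt = f"{question}\n{prefill}"
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # next-token logits only

    # First-Token Probability: compare the probability mass assigned
    # to the first token of each option (leading space for BPE vocabularies).
    option_ids = [tokenizer.encode(f" {o}")[0] for o in options]
    probs = torch.softmax(logits, dim=-1)[option_ids]
    return options[int(probs.argmax())], probs

answer, probs = ftp_answer(
    "Q: Which planet is known as the Red Planet?\n"
    "A. Venus\nB. Mars\nC. Jupiter\nD. Saturn"
)
print(answer, probs.tolist())
```

Note that only a single forward pass is needed, which is what makes this approach far cheaper than open-ended generation followed by an external answer classifier.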
Primary Area: generative models
Submission Number: 6943