Listening to the Wise Few: Select-and-Copy Attention Heads for Multiple-Choice QA

27 Sept 2024 (modified: 05 Feb 2025) · Submitted to ICLR 2025 · CC BY 4.0
Keywords: large language models (LLMs), attention mechanisms, model interpretability, zero-shot learning
TL;DR: We enhance LLM performance on multiple-choice question answering by leveraging query-key interactions in specific "select-and-copy" attention heads, achieving accuracy improvements of up to 16% and providing deeper insight into the LLM decision process.
Abstract: Multiple-choice question answering (MCQA) is one of the most widely adopted methods for evaluating large language models (LLMs). In this approach, the model is presented with a question and a set of possible answers, and the answer with the highest logit is selected as the model's prediction. However, this evaluation format has limitations: even if the model knows the correct answer, it may struggle to select the corresponding option simply due to difficulties in following this rigid format. Methods such as instruction tuning or in-context learning help alleviate this issue but introduce their own biases, such as dependence on the order and semantics of training examples. In this paper, we address this issue by conducting an intrinsic investigation of the LLM's decision-making process when answering multiple-choice questions. In particular, we identify and study select-and-copy heads responsible for choosing the correct answer. We develop new scores to reveal the underlying knowledge from these heads: the Query-Key Score, which measures the interaction between query and key representations in the selected head, and the Attention Score, which is based on the attention weights. By studying these scores, we find that the most pronounced select-and-copy heads are consistent across four popular MCQA datasets. Moreover, our scores enable better knowledge extraction, achieving up to a 16% gain for LLaMA2-7B and up to a 10% gain for larger models on these benchmarks. On a synthetic dataset, where the correct answer is known explicitly, accuracy increases by nearly 60%, confirming the method's effectiveness in overcoming MCQA format limitations. To support our claims, we conduct experiments on models ranging from 1.5 billion to 70 billion parameters, in both zero-shot and few-shot settings.
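To make the head-based scoring idea concrete, below is a minimal sketch (not the paper's implementation). It assumes a HuggingFace causal LM, a pre-identified head index (`LAYER`, `HEAD`), and a prompt in which each option letter appears as its own token; it implements only an attention-weight-based score in the spirit of the Attention Score, whereas the Query-Key Score would instead use the pre-softmax query-key products of the same head.

```python
# Sketch: rank MCQA options by the attention a chosen "select-and-copy" head
# pays from the final prompt token to each option-letter token.
# MODEL_NAME, LAYER, and HEAD are illustrative assumptions, not values from the paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "meta-llama/Llama-2-7b-hf"  # assumption: any causal LM with attention outputs
LAYER, HEAD = 20, 5                      # assumption: a head identified beforehand

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME, attn_implementation="eager"  # eager attention so weights are returned
)
model.eval()

def attention_score(prompt, option_letters=("A", "B", "C", "D")):
    """Score each option by the attention weight that the last token's query
    assigns to that option's letter token in the chosen head."""
    enc = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, output_attentions=True)
    # out.attentions: tuple over layers of [batch, heads, seq_len, seq_len]
    attn = out.attentions[LAYER][0, HEAD, -1]  # weights from the final token
    tokens = tokenizer.convert_ids_to_tokens(enc.input_ids[0].tolist())
    scores = {}
    for letter in option_letters:
        # assumption: each option letter is tokenized as a standalone token
        positions = [i for i, t in enumerate(tokens) if t.lstrip("▁") == letter]
        scores[letter] = max(attn[i].item() for i in positions) if positions else 0.0
    return scores

prompt = (
    "Question: Which planet is known as the Red Planet?\n"
    "A. Venus\nB. Mars\nC. Jupiter\nD. Saturn\nAnswer:"
)
scores = attention_score(prompt)
print(scores, "->", max(scores, key=scores.get))  # highest-scoring option is the prediction
```

In this sketch the prediction comes from the head's attention pattern rather than from the output logits, which is the sense in which such scores can bypass the rigid answer-selection format described above.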
Supplementary Material: zip
Primary Area: foundation or frontier models, including LLMs
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 11753