Two Heads are Better than One: Retrieval Augmented LLM for Question Answering with External Knowledge Attention
Keywords: question answering, large language modeling, retrieval augmented generation, knowledge
Abstract: Retrieval-augmented generation (RAG) for large language models (LLMs) has recently attracted significant attention owing to its ability to fill knowledge gaps and produce reliable answers to specific questions. Existing RAG approaches typically optimize knowledge processing by filtering out irrelevant or incorrect information and restructuring it for model input, thereby improving the accuracy of answers to given questions. A common approach is to concatenate the retrieved knowledge with the input question and feed the combined text into the LLM to produce an answer. This requires the LLM to have strong knowledge comprehension and reasoning capabilities to exploit the useful information, and it may lead to errors when the model fails to correctly interpret the relevant knowledge. In this paper, we propose a novel approach that augments LLMs with external knowledge attention for question answering (QA), where the attention is functionalized as an extra head integrated with the LLM's internal attention heads. We develop a memory-based mechanism that dynamically controls the degree of knowledge integration through the extra head, based on the relationship between the question and the retrieved knowledge, and allows differentiated fusion of external knowledge and the LLM's internal ability at different layers. Experiments on both general and domain-specific QA tasks demonstrate the effectiveness of our approach, highlighting its potential for optimizing LLMs on similar challenges.
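The sketch below is not the authors' implementation; it is a minimal illustration, under assumed names (ExternalKnowledgeHead, gate, d_model, d_head), of the idea the abstract describes: an extra attention head over encoded retrieved knowledge whose output is fused with the LLM's internal attention, gated per layer by a score derived from the question-knowledge relationship.

```python
# Hedged sketch only: illustrative, not the paper's released code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ExternalKnowledgeHead(nn.Module):
    def __init__(self, d_model: int, d_head: int):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_head)   # queries from the LLM hidden states
        self.k_proj = nn.Linear(d_model, d_head)   # keys from encoded retrieved knowledge
        self.v_proj = nn.Linear(d_model, d_head)   # values from encoded retrieved knowledge
        self.o_proj = nn.Linear(d_head, d_model)
        # memory-style gate: how much external knowledge this layer absorbs
        self.gate = nn.Linear(2 * d_model, 1)

    def forward(self, hidden, knowledge, internal_out):
        # hidden:       (batch, seq_len, d_model)  LLM hidden states at this layer
        # knowledge:    (batch, k_len,  d_model)   encoded retrieved passages
        # internal_out: (batch, seq_len, d_model)  output of the LLM's own attention
        q = self.q_proj(hidden)
        k = self.k_proj(knowledge)
        v = self.v_proj(knowledge)
        scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
        ext = self.o_proj(F.softmax(scores, dim=-1) @ v)   # external-knowledge attention
        # gate from the relation between the question states and a knowledge summary
        summary = knowledge.mean(dim=1, keepdim=True).expand_as(hidden)
        g = torch.sigmoid(self.gate(torch.cat([hidden, summary], dim=-1)))
        return internal_out + g * ext                      # differentiated per-layer fusion
```

In this reading, each transformer layer keeps its own copy of the extra head and gate, so the degree of knowledge fusion can differ across layers, which is the "differentiated fusion" behavior the abstract claims.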
Primary Area: foundation or frontier models, including LLMs
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 13840