Keywords: Retrieval-augmented Generation; Large Language Model; Information Retrieval
TL;DR: A fully LLM-based RAG framework with two-stage fine-tuning to address the feature locality and variance problems
Abstract: Retrieval-augmented generation (RAG) has shown an impressive capability to provide reliable answer predictions and to mitigate severe hallucination problems. A typical RAG implementation adopts powerful retrieval models to extract external information and leverages large language models (LLMs) to generate the corresponding answers. In contrast, recent LLM-based retrieval has attracted much attention because it brings substantial improvements to information retrieval (IR) via LLMs' strong semantic understanding capability. However, directly applying LLMs to RAG systems presents certain challenges. It may cause feature locality problems, since massive parametric knowledge impedes the effective use of global information across the whole corpus; e.g., an LLM-based retriever usually takes the summary of a document as input rather than the whole document. Moreover, the variety of tasks on which LLMs are pre-trained induces severe variance, which further weakens their performance as retrievers.
To address these issues, we propose a novel two-stage fine-tuning architecture called Invar-RAG. In the retrieval stage, an LLM-based retriever is constructed by integrating LoRA-based representation learning to address the feature locality problem. To verify and consolidate the retriever's performance, two patterns (i.e., invariant and variant patterns) and an invariance loss are also developed to alleviate the variance in LLMs. Moreover, in the generation stage, a carefully designed fine-tuning method is devised to improve the LLM's accuracy in generating answers based on the retrieved information. Experimental results demonstrate that Invar-RAG significantly outperforms existing baselines across three open-domain question answering (ODQA) datasets. The code is available in the Supplementary Material to ease reproducibility.
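As a rough illustration of the invariance objective described in the abstract, the sketch below pairs a standard dense-retrieval loss with a loss that aligns representations produced under the invariant and variant patterns. The exact loss form, the pattern construction, and all names (invariance_loss, retrieval_loss, lam) are assumptions for illustration, not the paper's formulation.

    import torch
    import torch.nn.functional as F

    def invariance_loss(inv_repr: torch.Tensor, var_repr: torch.Tensor) -> torch.Tensor:
        # Hypothetical formulation: penalize the discrepancy between the
        # retriever representations produced under the invariant pattern
        # and the variant pattern (the paper's exact loss is not given here).
        inv_repr = F.normalize(inv_repr, dim=-1)
        var_repr = F.normalize(var_repr, dim=-1)
        return F.mse_loss(inv_repr, var_repr)

    def retrieval_loss(query: torch.Tensor, docs: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # Standard dense-retrieval objective: cosine similarities between a
        # batch of query embeddings [B, d] and candidate document embeddings
        # [N, d], trained with cross-entropy against the gold document index.
        scores = F.normalize(query, dim=-1) @ F.normalize(docs, dim=-1).T
        return F.cross_entropy(scores, target)

    # Combined retrieval-stage objective (lam is an assumed trade-off weight):
    # total = retrieval_loss(q, d, t) + lam * invariance_loss(inv, var)

In this reading, the embeddings would come from a LoRA-adapted LLM, so only the low-rank adapter parameters are updated during the retrieval-stage fine-tuning.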
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 9373