OpenRAG: Optimizing RAG End-to-End via In-Context Retrieval Learning
Keywords: Retrieval-augmented generation, RAG
TL;DR: An end-to-end optimized RAG framework that tunes a retriever to effectively fetch data tailored to specific task requirements.
Abstract: Retrieval-augmented generation (RAG) serves as a bridge connecting large language models (LLMs) to downstream data sources. Despite their widespread adoption, existing RAG frameworks typically pair off-the-shelf retrievers with LLMs without joint training. In this paper, we analyze and empirically show that the notion of relevance learned for traditional information retrieval scenarios does not consistently transfer to RAG scenarios. To bridge this gap, we introduce OpenRAG, a RAG framework that is optimized end-to-end by tuning the retriever to capture in-context, open-ended relevance, enabling adaptation to the diverse and evolving needs of downstream tasks. Extensive experiments across a wide range of tasks demonstrate that tuning the retriever end-to-end yields a consistent 4.0% improvement over the original retriever and outperforms existing state-of-the-art retrievers by 2.1%. Moreover, our results show that for certain tasks, an end-to-end tuned 0.2B retriever can deliver gains surpassing those of RAG-oriented or instruction-tuned 8B LLMs, underscoring the cost-effectiveness of our approach for improving RAG systems.
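The abstract does not spell out the training objective, but one common way to tune a retriever end-to-end from LLM feedback is a distillation-style loss that pushes the retriever's distribution over retrieved documents toward a distribution derived from how much each document helps the frozen LLM produce the gold answer. The sketch below is illustrative only and is not taken from the paper; the function names, shapes, and the KL-based objective are assumptions for exposition.

```python
# Hypothetical sketch of end-to-end retriever tuning from LLM feedback
# (distillation-style objective; names and shapes are illustrative only,
# not the method described in the OpenRAG paper).
import torch
import torch.nn.functional as F

def retriever_loss(query_emb, doc_embs, llm_answer_logprobs, temperature=1.0):
    """
    query_emb:           (d,)   query embedding from the trainable retriever
    doc_embs:            (k, d) embeddings of the k retrieved documents
    llm_answer_logprobs: (k,)   log-likelihood of the gold answer when the
                                frozen LLM is conditioned on each document
    """
    # Retriever's distribution over the retrieved documents.
    retrieval_scores = doc_embs @ query_emb / temperature        # (k,)
    retrieval_logprobs = F.log_softmax(retrieval_scores, dim=-1)

    # Target distribution: documents that help the LLM answer get more mass.
    target_probs = F.softmax(llm_answer_logprobs / temperature, dim=-1)

    # KL(target || retriever); gradients flow only into the retriever.
    return F.kl_div(retrieval_logprobs, target_probs, reduction="batchmean")

# Toy usage with random tensors standing in for real encoder/LLM outputs.
q = torch.randn(768, requires_grad=True)
docs = torch.randn(5, 768)
llm_scores = torch.randn(5)
loss = retriever_loss(q, docs, llm_scores)
loss.backward()
```

Under this kind of objective, only the retriever receives gradients, so the notion of "relevance" it learns is defined by downstream generation quality rather than by traditional IR-style labels.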
Submission Number: 21