Keywords: Retrieval-Augmented Generation (RAG), Question Answering (QA), Performance Bottleneck, Large Language Model (LLM)
Abstract: This paper investigates a fundamental question in LLM-based Retrieval-Augmented Generation (RAG): what is the key bottleneck limiting the performance of current RAG systems? We propose a decoupled perspective that analyzes the potential of the retrieval and generation stages separately. Specifically, we design a simple method to approximate the effect of an oracle retrieval metric in the retrieval stage and of an oracle way of utilizing the retrieved documents in the generation stage. On six classic question-answering benchmarks, by comparing the performance of standard RAG with that of its oracle variants, we obtain several valuable findings. First, even oracle retrieval does not improve RAG performance as much as expected. Second, enabling generation models to make good use of the retrieved documents holds greater potential for boosting RAG.
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 18235