Keywords: Retrieval Augmented Generation, Language Modeling, Question Answering
Abstract: Retrieval-augmented generation (RAG) systems have shown promise in improving task performance by leveraging external context, but realizing their full potential depends on careful configuration.
In this paper, we investigate how the choice of retriever and reader models, context length, and context quality affect RAG performance across different task types. Our findings reveal that while some readers consistently benefit from additional context, others degrade when exposed to irrelevant information, highlighting the need to tune context length to each reader's sensitivity to noise. Moreover, retriever improvements do not always translate into proportional gains in reader performance, particularly in open-domain question answering. In specialized tasks, however, even small retrieval improvements can substantially boost reader performance. These insights underscore the importance of optimizing RAG systems by aligning their configuration with task complexity and domain-specific needs.
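For concreteness, the sketch below (an illustration, not the authors' code) shows the three configuration axes the abstract names: a swappable retriever, a swappable reader, and a context-length knob `context_k`. The `overlap_retriever` and `echo_reader` functions are toy stand-ins; a real system would substitute a dense retriever and an LLM reader.

```python
# Minimal sketch of the RAG configuration axes studied in the paper:
# retriever choice, reader choice, and context length (context_k).
from dataclasses import dataclass
from typing import Callable, List

Retriever = Callable[[str, List[str], int], List[str]]
Reader = Callable[[str, List[str]], str]

def overlap_retriever(query: str, corpus: List[str], k: int) -> List[str]:
    """Toy lexical retriever: rank passages by word overlap with the query."""
    q = set(query.lower().split())
    ranked = sorted(corpus, key=lambda p: -len(q & set(p.lower().split())))
    return ranked[:k]

def echo_reader(query: str, contexts: List[str]) -> str:
    """Stub reader: a real system would prompt an LLM with the concatenated context."""
    return f"Answer({query!r}) given {len(contexts)} passages"

@dataclass
class RAGConfig:
    retriever: Retriever
    reader: Reader
    context_k: int  # context-length knob: how many retrieved passages the reader sees

def run_rag(cfg: RAGConfig, query: str, corpus: List[str]) -> str:
    contexts = cfg.retriever(query, corpus, cfg.context_k)
    return cfg.reader(query, contexts)

if __name__ == "__main__":
    corpus = [
        "RAG combines retrieval with generation.",
        "Bananas are yellow.",
        "Reader models vary in sensitivity to irrelevant context.",
    ]
    # Sweeping context_k probes the trade-off the paper reports: more context
    # helps noise-tolerant readers but can hurt noise-sensitive ones.
    for k in (1, 2, 3):
        print(run_rag(RAGConfig(overlap_retriever, echo_reader, k),
                      "How do readers handle irrelevant context?", corpus))
```

Sweeping `context_k` while holding the retriever and reader fixed is one way to expose the reader-specific sensitivity to irrelevant context that the abstract describes.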
Submission Number: 20