Keywords: Retrieval-Augmented Generation, Language Modeling, Question Answering
Abstract: Retrieval-augmented generation (RAG) systems have shown promise in improving task performance by leveraging external context, but realizing their full potential depends on careful configuration. In this paper, we investigate how the choice of retriever and reader models, context length, and context quality affect RAG performance across different task types. Our findings reveal that while some readers consistently benefit from additional context, others degrade when exposed to irrelevant information, highlighting the need to tune systems to each reader's sensitivity to noise. Moreover, retriever improvements do not always translate into proportional gains in reader results, particularly for open-domain question answering. In specialized tasks, however, even small improvements in retrieval can significantly boost reader results. These insights underscore the importance of optimizing RAG systems by aligning their configuration with task complexity and domain-specific needs.
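To make the configuration dimensions concrete, the following is a minimal, self-contained sketch of a RAG pipeline exposing the knobs the abstract refers to (retriever choice, reader choice, context length, and a context-quality filter). The toy retriever, stub reader, and all names here are illustrative stand-ins, not the system evaluated in the paper.

```python
# Illustrative RAG configuration sketch; all components are hypothetical
# stand-ins, not the paper's actual implementation.
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class RAGConfig:
    top_k: int = 5          # context length: how many passages the reader sees
    min_score: float = 0.2  # context quality: drop passages below this relevance


def overlap_retriever(query: str, corpus: List[str]) -> List[Tuple[float, str]]:
    """Toy lexical retriever: score each passage by token overlap with the query."""
    q = set(query.lower().split())
    scored = [(len(q & set(doc.lower().split())) / max(len(q), 1), doc)
              for doc in corpus]
    return sorted(scored, reverse=True)


def echo_reader(question: str, contexts: List[str]) -> str:
    """Stub reader: a real system would send this prompt to an LLM."""
    prompt = "\n".join(contexts) + f"\n\nQuestion: {question}\nAnswer:"
    return prompt  # placeholder for an actual LLM call


def rag_answer(question: str, corpus: List[str],
               retriever: Callable, reader: Callable,
               cfg: RAGConfig) -> str:
    # Retrieve, filter by the quality threshold, then truncate to the
    # context budget before handing passages to the reader.
    hits = retriever(question, corpus)
    contexts = [doc for score, doc in hits if score >= cfg.min_score][:cfg.top_k]
    return reader(question, contexts)


if __name__ == "__main__":
    corpus = ["RAG pairs a retriever with a reader model.",
              "Completely unrelated sentence about cooking pasta."]
    print(rag_answer("What does RAG pair together?", corpus,
                     overlap_retriever, echo_reader, RAGConfig(top_k=1)))
```

Under this framing, the abstract's findings correspond to sweeping `top_k` and `min_score` per reader, and swapping the retriever or reader components per task.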
Primary Area: foundation or frontier models, including LLMs
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 11603