ECoRAG: Evidentiality-guided Compression for Long Context RAG

ACL ARR 2024 December Submission 1174 Authors

15 Dec 2024 (modified: 05 Feb 2025) · ACL ARR 2024 December Submission · CC BY 4.0
Abstract: Large Language Models (LLMs) have shown remarkable performance in Open-Domain Question Answering (ODQA) by leveraging external documents through Retrieval-Augmented Generation (RAG). To reduce the overhead that longer contexts impose on RAG, context compression is necessary. However, previous compression methods do not focus on filtering out non-evidential information, which limits performance in LLM-based RAG. We therefore propose the Evidentiality-guided RAG, or ECoRAG, framework. ECoRAG improves LLM performance by compressing retrieved documents with a focus on evidentiality, ensuring that answer generation is supported by the correct evidence. As an additional step, ECoRAG checks whether the compressed content provides sufficient evidence and, if not, retrieves more until it does. Experiments show that ECoRAG improves LLM performance on ODQA tasks, outperforming existing compression methods. Furthermore, ECoRAG is highly cost-efficient, as it not only reduces total RAG latency but also minimizes token usage by retaining only the information necessary to generate the correct answer. The code is publicly available for further exploration: https://anonymous.4open.science/r/ecorag-54BF
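
The abstract describes a compress-then-check loop: score retrieved content for evidentiality, keep only evidential sentences, and retrieve more documents if the compressed evidence is not yet sufficient. Below is a minimal Python sketch of that loop; the helpers (retriever.retrieve, score_evidentiality, is_sufficient) and the 0.5 threshold are illustrative assumptions, not the authors' actual API, which is available in the linked repository.

    # Hypothetical sketch of the evidentiality-guided compress-and-check loop
    # described in the abstract. Helper names and the threshold are assumptions.

    def ecorag_answer(question, retriever, llm, batch_size=10, max_docs=100):
        compressed = []        # sentences kept as evidence so far
        num_retrieved = 0

        while num_retrieved < max_docs:
            # Retrieve the next batch of documents for the question.
            docs = retriever.retrieve(question, offset=num_retrieved, k=batch_size)
            num_retrieved += batch_size

            # Keep only sentences judged evidential for answering the question.
            for doc in docs:
                for sent in doc.sentences:
                    if score_evidentiality(question, sent) > 0.5:  # assumed threshold
                        compressed.append(sent)

            # Stop once the compressed context is judged sufficient to answer.
            if is_sufficient(question, compressed):
                break

        context = " ".join(compressed)
        return llm.generate(f"Context: {context}\nQuestion: {question}\nAnswer:")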
Paper Type: Long
Research Area: Question Answering
Research Area Keywords: Efficient/Low-Resource Methods for NLP, Question Answering
Contribution Types: NLP engineering experiment, Approaches to low-resource settings, Approaches to low-compute settings (efficiency)
Languages Studied: English
Submission Number: 1174