QE-RAG: A Robust Retrieval-Augmented Generation Benchmark for Query Entry Errors

ACL ARR 2025 May Submission751 Authors

15 May 2025 (modified: 03 Jul 2025) · ACL ARR 2025 May Submission · CC BY 4.0
Abstract: While current benchmarks evaluate the performance of RAG methods from various perspectives, they share the common assumption that user queries used for retrieval are error-free. However, in real-world interactions between users and LLMs, query entry errors are frequent, and the robustness of current RAG methods to such errors remains largely unexplored. To bridge this gap, we propose QE-RAG, the first robust RAG benchmark designed specifically to evaluate performance under query entry errors. We analyze the impact of these errors on LLM outputs and find that corrupted queries degrade model performance; this degradation can be mitigated by correcting the queries and by training a robust retriever to retrieve relevant documents. Based on these insights, we propose a contrastive learning-based robust retriever training method and a retrieval-augmented query correction method. Extensive in-domain and cross-domain experiments reveal that: (1) state-of-the-art RAG methods, including sequential, branching, and iterative methods, exhibit poor robustness to query entry errors; (2) our method significantly enhances the robustness of RAG to query entry errors and is compatible with existing RAG methods, further improving their robustness.
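The abstract names a contrastive learning-based robust retriever training method but does not specify its objective. The sketch below is only an illustrative guess at what such an objective could look like: an InfoNCE-style loss that pulls a corrupted query toward its clean counterpart (or gold document) while using other in-batch examples as negatives. The function name, batch construction, and temperature are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch (not the paper's released code): InfoNCE-style contrastive
# loss pairing corrupted queries with their clean counterparts / gold documents,
# using in-batch negatives, as one plausible way to train an error-robust retriever.
import torch
import torch.nn.functional as F


def info_nce_loss(corrupted_q_emb: torch.Tensor,
                  positive_emb: torch.Tensor,
                  temperature: float = 0.05) -> torch.Tensor:
    """Each corrupted-query embedding should score highest against its own
    positive (clean query or relevant document); other rows act as negatives."""
    q = F.normalize(corrupted_q_emb, dim=-1)
    p = F.normalize(positive_emb, dim=-1)
    logits = q @ p.T / temperature                     # (B, B) similarity matrix
    labels = torch.arange(q.size(0), device=q.device)  # diagonal entries are positives
    return F.cross_entropy(logits, labels)


if __name__ == "__main__":
    # Toy usage: random tensors stand in for retriever-encoder outputs.
    batch, dim = 8, 768
    corrupted = torch.randn(batch, dim)  # encoder(corrupted queries)
    positives = torch.randn(batch, dim)  # encoder(clean queries or gold docs)
    print(f"contrastive loss: {info_nce_loss(corrupted, positives).item():.4f}")
```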
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: benchmarking
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data resources, Data analysis
Languages Studied: English
Keywords: benchmark, retrieval-augmented generation, query entry errors
Submission Number: 751