PureCover: Bridging the Gap in Re-ranking for Retrieval-Augmented Generation via Balancing Coverage and Noise

ICLR 2026 Conference Submission16803 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0

Keywords: Retrieval-Augmented Generation, Re-ranking, Multi-objective Optimization

Abstract: Re-ranking, originating from Information Retrieval (IR), has become a critical technique for filtering retrieved documents in Retrieval-Augmented Generation (RAG). Current RAG systems often directly apply re-rankers from traditional IR, which were originally designed to provide relevant and diverse documents to human users. However, this adoption overlooks a fundamental gap: unlike humans, who can use selective attention to filter noise and focus on key evidence, LLMs lack this ability. Because of this gap, traditional re-rankers fail to cover essential evidence and to minimize noise for LLMs, significantly hurting RAG performance, especially in complex question-answering tasks. To address this, we argue that RAG re-rankers should serve a distinct objective: not only ensuring coverage of key information but also minimizing noise in the selected document set. To achieve this objective, we propose PureCover, a document selection framework tailored for RAG. Instead of relying on traditional Top-K re-ranking, we reformulate the document selection process as a multi-objective optimization problem and solve it by exploiting LLM attention patterns during goal-oriented reasoning. To improve efficiency, we distill the selection capability into an LLM selector via a set-wise strategy. Experiments on four multi-hop QA benchmarks demonstrate that PureCover consistently outperforms state-of-the-art baselines, achieving a better balance between coverage and noise for RAG.
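To make the coverage-versus-noise trade-off concrete, the following is a minimal toy sketch, not the PureCover method itself. It assumes each document comes with a known set of evidence facts and a noise score, and it collapses the two objectives into a single weighted score that is maximized greedily. The names `select_documents`, `noise_weight`, and the sample data are hypothetical; the abstract states that PureCover instead solves the problem using LLM attention patterns, which this sketch does not model.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy representation (not from the paper): each document is
# (evidence facts it contains, noise score such as share of irrelevant text).
Doc = Tuple[Set[str], float]


def select_documents(
    docs: Dict[str, Doc], required: Set[str], noise_weight: float = 0.5
) -> List[str]:
    """Greedily add documents while the scalarized gain
    (newly covered evidence - noise_weight * noise) stays positive."""
    selected: List[str] = []
    covered: Set[str] = set()
    remaining = dict(docs)

    def gain(name: str) -> float:
        facts, noise = remaining[name]
        newly_covered = (facts & required) - covered
        return len(newly_covered) - noise_weight * noise

    while remaining:
        # Iterate in sorted order so ties are broken deterministically.
        best = max(sorted(remaining), key=gain)
        if gain(best) <= 0:
            break  # every remaining document adds more noise than evidence
        selected.append(best)
        covered |= remaining.pop(best)[0] & required
    return selected


docs = {
    "d1": ({"a"}, 0.2),
    "d2": ({"b"}, 0.4),
    "d3": ({"a"}, 0.9),   # redundant with d1 and noisier
    "d4": (set(), 1.0),   # pure noise
}
print(select_documents(docs, {"a", "b"}))
```

Unlike a fixed Top-K cut, the selected set here has variable size: it stops growing once no remaining document contributes new evidence that outweighs its noise, and a larger `noise_weight` yields a smaller, cleaner set.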

Primary Area: applications to computer vision, audio, language, and other modalities

Submission Number: 16803
