Provence: efficient and robust context pruning for retrieval-augmented generation

Published: 22 Jan 2025, Last Modified: 11 Feb 2025 · ICLR 2025 Poster · CC BY 4.0
Keywords: retrieval-augmented generation, context pruning, question answering
TL;DR: We propose Provence, an efficient and robust context pruner for question answering, which dynamically detects the amount of irrelevant information in contexts and prunes it out with a negligible to no drop in performance.
Abstract: Retrieval-Augmented Generation improves various aspects of large language model (LLM) generation, but suffers from the computational overhead caused by long contexts and from the propagation of irrelevant retrieved information into generated responses. Context pruning addresses both issues by removing irrelevant parts of retrieved contexts before LLM generation. Existing context pruning approaches are limited and do not provide a universal model that is both _efficient_ and _robust_ across a wide range of scenarios, e.g., when contexts contain a variable amount of relevant information or vary in length, or when evaluated on various domains. In this work, we close this gap and introduce Provence (Pruning and Reranking Of retrieVEd relevaNt ContExts), an efficient and robust context pruner for Question Answering, which dynamically detects the needed amount of pruning for a given context and can be used out-of-the-box across domains. The three key ingredients of Provence are formulating the context pruning task as sequence labeling, unifying context pruning capabilities with context reranking, and training on diverse data. Our experimental results show that Provence enables context pruning with negligible to no drop in performance, across domains and settings, at almost no cost in a standard RAG pipeline. We also conduct a deeper analysis alongside various ablations to provide insights into training context pruners for future work.
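For intuition, a minimal sketch of the sequence-labeling formulation mentioned in the abstract is given below: a token-classification encoder scores each context token for relevance to the question, and tokens below a threshold are pruned. The model checkpoint, threshold value, and BERT-style `token_type_ids` handling are illustrative assumptions, not the authors' released implementation.

```python
# Sketch: context pruning as sequence labeling (illustrative, not the official Provence code).
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

# Hypothetical checkpoint; assumes a BERT-style encoder with 2 labels (irrelevant / relevant).
MODEL_NAME = "some-org/context-pruner-base"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForTokenClassification.from_pretrained(MODEL_NAME, num_labels=2)


def prune_context(question: str, context: str, threshold: float = 0.5) -> str:
    # Encode the (question, context) pair; the context is segment 1 for BERT-style models.
    enc = tokenizer(question, context, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**enc).logits  # shape: (1, seq_len, 2)

    # Per-token probability that the token is relevant to the question.
    relevance = logits.softmax(dim=-1)[0, :, 1]

    # Keep only context tokens (segment 1) predicted as relevant, then decode
    # them back into a pruned context string passed to the LLM.
    ids = enc["input_ids"][0]
    in_context = enc["token_type_ids"][0] == 1
    kept = ids[(relevance > threshold) & in_context]
    return tokenizer.decode(kept, skip_special_tokens=True)
```

In practice the labeling would typically be aggregated at the sentence level rather than per token, and the same encoder can also emit a query-context relevance score, which is how pruning and reranking can share one model.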
Primary Area: applications to computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 10706