Keywords: Passage reranking, Retrieval-Augmented Generation
TL;DR: We propose a new reranking method that mitigates the challenges of cross-passage inference.
Abstract: Retrieval-Augmented Generation (RAG) systems rely on retrieving relevant evidence from a corpus to support downstream generation.
The common practice of splitting a long document into multiple shorter passages enables finer-grained and targeted information retrieval.
However, it also introduces challenges when correct retrieval requires inference across passages, such as resolving coreference, disambiguating entities, and aggregating evidence scattered across multiple sources.
Many state-of-the-art (SOTA) reranking methods, despite relying on powerful large pretrained language models with potentially high inference costs, still neglect these challenges.
Therefore, we propose the Embedding-Based Context-Aware Reranker (EBCAR), a lightweight reranking framework that operates directly on the embeddings of retrieved passages. EBCAR enhances cross-passage understanding by exploiting the structural information of the passages and a hybrid attention mechanism that captures both high-level interactions across documents and low-level relationships within each document.
We evaluate EBCAR against SOTA rerankers on the ConTEB benchmark, demonstrating its effectiveness for information retrieval requiring cross-passage inference and its advantages in both accuracy and efficiency.
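The abstract describes a hybrid attention mechanism combining cross-document and within-document interactions. As a rough illustration only, the sketch below shows one way such a mechanism could be realized in PyTorch: a low-level attention pass masked to passages from the same document and a high-level pass over all retrieved passages. The class name, dimensions, masking scheme, and the way the two outputs are combined and scored against the query are assumptions, not the authors' EBCAR implementation.

```python
# Hypothetical sketch of a hybrid attention reranker over passage embeddings.
import torch
import torch.nn as nn


class HybridAttentionSketch(nn.Module):
    def __init__(self, dim: int = 768, n_heads: int = 8):
        super().__init__()
        self.within_doc_attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.cross_doc_attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)

    def forward(self, query_emb, passage_emb, doc_ids):
        # query_emb: (1, dim), passage_emb: (1, n_passages, dim),
        # doc_ids: (n_passages,) source-document id of each passage.
        same_doc = doc_ids.unsqueeze(0) == doc_ids.unsqueeze(1)   # (n, n) bool
        # Low-level attention: each passage only sees passages of its own document.
        within, _ = self.within_doc_attn(
            passage_emb, passage_emb, passage_emb,
            attn_mask=~same_doc,   # True = attention is blocked at this position
        )
        # High-level attention: interactions across all retrieved documents.
        across, _ = self.cross_doc_attn(passage_emb, passage_emb, passage_emb)
        # Score each context-enriched passage embedding against the query.
        contextual = within + across
        return torch.einsum("bnd,bd->bn", contextual, query_emb)  # (1, n_passages)


# Usage: rerank five retrieved passages drawn from two documents.
reranker = HybridAttentionSketch()
scores = reranker(torch.randn(1, 768), torch.randn(1, 5, 768),
                  torch.tensor([0, 0, 0, 1, 1]))
ranking = scores.argsort(descending=True)
```

Operating on precomputed passage embeddings, as in this sketch, is what keeps such a reranker lightweight relative to cross-encoder rerankers that re-encode every query-passage pair.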
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 1756