Abstract: Traditional Retrieval-Augmented Generation (RAG) struggles with complex queries that lack strong signals to retrieve the most relevant context, forcing a trade-off between choosing a small context that misses key information and a large context that confuses the LLM. To address this, we propose Forward-Backward RAG (FB-RAG), a new training-free framework based on a simple yet powerful forward-looking strategy. FB-RAG employs a lightweight LLM to peek into potential future generations, using evidence from multiple sampled outputs to precisely identify the most relevant context for a final, more powerful generator. This improves performance without the complex finetuning or reinforcement learning common in prior work. Across $9$ datasets from LongBench and $\infty$Bench, FB-RAG consistently delivers strong results. Further, the performance gains can be achieved with reduced latency due to a shorter, more focused prompt for the powerful generator. On the EN.QA dataset, FB-RAG matches the leading baseline with over $48$\% latency reduction, or achieves an $8$\% performance improvement with a $10$\% latency reduction. Our analysis reveals cases where, even when the forward-looking LLM fails to generate a correct answer, its attempts are sufficient to guide the final model to an accurate response, demonstrating how smaller LLMs can systematically improve the performance and efficiency of larger ones.
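For intuition, below is a minimal sketch of the forward-looking retrieval idea described in the abstract. The helper names (`light_llm`, `embed`), the number of sampled "peeks", the equal weighting of the backward (query-based) and forward (sample-based) signals, and the prompt wording are illustrative assumptions, not the paper's actual implementation.

```python
# Sketch of forward-backward context selection: score chunks against both the
# query and several sampled draft answers from a cheap LLM, then keep a small,
# focused context for the stronger generator. All names here are hypothetical.
from typing import Callable, List
import numpy as np

def fb_rag_context(
    query: str,
    chunks: List[str],                      # pre-split long context
    light_llm: Callable[[str], str],        # cheap generator, sampled with temperature > 0
    embed: Callable[[str], np.ndarray],     # any text-embedding function
    n_samples: int = 4,                     # how many forward "peeks" to draw
    top_k: int = 8,                         # size of the focused context
) -> List[str]:
    def cos(a: np.ndarray, b: np.ndarray) -> float:
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

    # Backward signal: similarity of each chunk to the query itself.
    q_vec = embed(query)
    chunk_vecs = [embed(c) for c in chunks]
    backward = np.array([cos(q_vec, v) for v in chunk_vecs])

    # Forward signal: sample candidate answers from the light LLM and score
    # chunks against them; even imperfect drafts can point at the right evidence.
    forward = np.zeros(len(chunks))
    for _ in range(n_samples):
        draft = light_llm(f"Answer briefly with your best guess.\nQuestion: {query}")
        d_vec = embed(draft)
        forward += np.array([cos(d_vec, v) for v in chunk_vecs])
    forward /= n_samples

    # Combine both signals (equal weighting is an assumption) and keep the
    # top-k chunks, in original order, as the prompt for the powerful generator.
    scores = 0.5 * backward + 0.5 * forward
    keep = sorted(np.argsort(-scores)[:top_k])
    return [chunks[i] for i in keep]
```

A shorter, more focused context selected this way is what yields the latency reductions reported above, since the expensive generator reads far fewer tokens.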
Paper Type: Long
Research Area: Generation
Research Area Keywords: retrieval-augmented generation, automatic evaluation, text-to-text generation, inference methods
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 221