Keywords: Retrieval-Augmented Generation, Question Answering
Abstract: Retrieval-Augmented Generation (RAG) grounds language models in external evidence, but multi-hop question answering remains difficult because iterative pipelines must decide what to retrieve next and judge when the available evidence is adequate. In practice, systems may answer from incomplete evidence chains, or accumulate redundant or distractor-heavy text that interferes with later retrieval and reasoning. We propose S2G-RAG (Structured Sufficiency and Gap-judging RAG), an iterative framework with an explicit controller, S2G-Judge. At each turn, S2G-Judge predicts whether the current evidence memory supports answering and, if not, outputs structured gap items describing the missing information. We map these gap items into the next retrieval query, producing stable multi-turn retrieval trajectories. To reduce noise accumulation, we maintain a sentence-level Evidence Context by extracting a compact set of relevant sentences from retrieved documents. Experiments on TriviaQA, HotpotQA, and 2WikiMultiHopQA show that S2G-RAG improves multi-hop QA performance and robustness under multi-turn retrieval. Moreover, S2G-RAG integrates into existing RAG pipelines as a lightweight component, without modifying the search engine or retraining the generator.
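The judge-then-retrieve loop described in the abstract can be sketched in a few lines. This is a minimal, hypothetical illustration of such a controller, not the paper's implementation: the judge, retriever, and sentence extractor below are keyword-matching stubs, and all function names (`judge`, `retrieve`, `extract_sentences`, `s2g_loop`) are illustrative assumptions.

```python
# Hypothetical sketch of an S2G-style controller loop (stub components, not
# the paper's actual models). A judge inspects the evidence memory and either
# declares it sufficient or emits "gap" items; gaps are mapped into the next
# retrieval query, and only a few relevant sentences per retrieved document
# are kept, approximating the sentence-level Evidence Context.

def judge(question, evidence):
    """Stub judge: sufficient once every question term appears in evidence."""
    have = " ".join(evidence).lower()
    gaps = [w for w in question.lower().split() if w not in have]
    return {"sufficient": not gaps, "gaps": gaps}

def retrieve(query, corpus):
    """Stub retriever: return documents sharing any query term."""
    terms = set(query.lower().split())
    return [d for d in corpus if terms & set(d.lower().split())]

def extract_sentences(doc, query, k=2):
    """Keep at most k sentences mentioning a query term (noise control)."""
    terms = set(query.lower().split())
    hits = [s.strip() for s in doc.split(".") if terms & set(s.lower().split())]
    return hits[:k]

def s2g_loop(question, corpus, max_turns=3):
    evidence = []
    for _ in range(max_turns):
        verdict = judge(question, evidence)
        if verdict["sufficient"]:
            break
        query = " ".join(verdict["gaps"])  # gap items -> next retrieval query
        for doc in retrieve(query, corpus):
            for sent in extract_sentences(doc, query):
                if sent not in evidence:
                    evidence.append(sent)
    return evidence

corpus = [
    "Paris is the capital of France. It lies on the Seine.",
    "France borders Spain. Madrid is the capital of Spain.",
]
print(s2g_loop("capital of France", corpus))
```

Because the controller only reads and writes the evidence memory, a loop of this shape can wrap an existing retriever and generator without modifying either, which matches the lightweight-integration claim above.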
Paper Type: Long
Research Area: Retrieval-Augmented Language Models
Research Area Keywords: multihop QA, retrieval-augmented generation
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 5554