Abstract: Retrieval-Augmented Generation (RAG) plays a critical role in mitigating hallucinations and improving factual accuracy for Large Language Models (LLMs). While dynamic retrieval techniques aim to determine retrieval timing and content based on a model's intrinsic needs, existing approaches struggle to generalize effectively in black-box model scenarios. To address this limitation, we propose the Semantic Contribution-Aware Adaptive Retrieval (SCAAR) framework. SCAAR iteratively leverages the semantic importance of words in upcoming sentences to dynamically adjust retrieval thresholds and filter information, retaining the top-P% most semantically significant words for constructing retrieval queries. We comprehensively evaluate SCAAR against baseline methods on four long-form, knowledge-intensive generation datasets using three different models. Extensive experiments also analyze the impact of the framework's hyperparameters. Our results demonstrate SCAAR's superior or competitive performance across all tasks, showcasing its ability to detect a model's retrieval needs and construct efficient retrieval queries that help models find relevant knowledge for problem-solving in black-box scenarios. Code is released in our GitHub repository.
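The query-construction step described above (keeping the top-P% most semantically important words) can be illustrated with a minimal sketch. This is not the authors' implementation: the word list, the importance scores, and the `top_p` parameter are all hypothetical stand-ins for whatever semantic-contribution measure SCAAR actually uses.

```python
def build_query(words, importance, top_p=0.5):
    """Keep the top-P fraction of words by (hypothetical) semantic
    importance and join them, in original order, into a retrieval query."""
    k = max(1, int(len(words) * top_p))
    # Rank word indices by importance, highest first.
    ranked = sorted(range(len(words)), key=lambda i: importance[i], reverse=True)
    # Restore original word order among the kept indices.
    keep = sorted(ranked[:k])
    return " ".join(words[i] for i in keep)

# Example with made-up scores: function words score low, content words high.
query = build_query(
    ["the", "capital", "of", "France", "is"],
    [0.10, 0.90, 0.10, 0.95, 0.20],
    top_p=0.4,
)
print(query)  # → capital France
```

In the full framework this filtering would be applied iteratively as generation proceeds, with the retrieval threshold adjusted dynamically rather than fixed as here.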
Paper Type: Long
Research Area: Dialogue and Interactive Systems
Research Area Keywords: retrieval augmented generation, language models, black-box models
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 3374