Abstract: Retrieval-augmented generation (RAG) and long-context language models (LCLMs) both address context limitations of LLMs in open-domain QA. However, how much external context to retrieve remains an open problem: fixed retrieval budgets risk wasting tokens or omitting key evidence. Existing adaptive methods like Self-RAG and Self-Route rely on iterative LLM prompting and perform well on factoid QA, but struggle with aggregation QA, where the optimal context size is unknown and variable.
We present Adaptive‑k retrieval, a simple and effective single-pass method that selects a query-specific number of passages by applying a threshold to the similarity scores between the query and candidate passages. It requires no model fine-tuning, extra LLM calls, or changes to existing retriever–reader pipelines. On both factoid and aggregation QA benchmarks, Adaptive‑k matches or outperforms fixed‑k baselines while using up to 10x fewer tokens than full-context input, yet still retrieves 70% of relevant passages. It improves accuracy across five LCLMs and two embedding models, highlighting that dynamically adjusting context size leads to more efficient and accurate QA.
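The following is a minimal sketch of the single-pass selection described above: score all candidate passages against the query with an embedding model, then keep only those passages whose similarity clears a threshold. The specific threshold rule, the `threshold` and `max_k` values, and the function name are illustrative assumptions; the abstract only states that a threshold is applied to query–passage similarity scores.

```python
import numpy as np

def adaptive_k_retrieve(query_emb, passage_embs, threshold=0.6, max_k=50):
    """Select a query-specific number of passages by thresholding
    query-passage similarity scores.

    Note: the absolute threshold used here is an assumption for
    illustration; the paper's exact thresholding rule may differ.
    """
    # Cosine similarity between the query and every candidate passage.
    q = query_emb / np.linalg.norm(query_emb)
    p = passage_embs / np.linalg.norm(passage_embs, axis=1, keepdims=True)
    scores = p @ q

    # Rank passages by similarity, keep those above the threshold,
    # and cap the context at max_k passages.
    order = np.argsort(-scores)
    selected = [i for i in order[:max_k] if scores[i] >= threshold]
    return selected, scores[selected]
```

In use, `selected` varies per query: a narrow factoid question may pass only a handful of passages to the reader, while an aggregation question with many relevant passages keeps a larger context, without any extra LLM calls.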
Paper Type: Long
Research Area: Question Answering
Research Area Keywords: retrieval-augmented generation, LLM efficiency, dense retrieval
Contribution Types: Approaches low compute settings-efficiency
Languages Studied: English
Submission Number: 5263