Keywords: Large Language Models, Search Capability Incentivization, Curriculum Learning, Tool Learning
Abstract: Retrieval-Augmented Generation (RAG) is essential for grounding Large Language Models, yet its deployment is hindered by the high latency and financial costs of frequent retrieval. Existing reinforcement learning approaches primarily maximize answer accuracy, inadvertently encouraging excessive search behavior and ignoring the trade-off between performance and efficiency.
To resolve this, we introduce PragmaticSearch, a framework that learns efficient retrieval. Unlike standard RL, PragmaticSearch employs a Two-Stage Advantage Shaping (TSAS) curriculum that explicitly decouples capability learning from cost calibration. We further introduce a gating mechanism, theoretically grounded in a Bayesian estimate of retrieval necessity, that dynamically neutralizes cost penalties when retrieval provides high utility. Optimized via our stabilized LS-GRPO algorithm, this approach prevents policy collapse. Experiments across seven benchmarks show that PragmaticSearch reduces retrieval calls by up to 76.2% while maintaining competitive performance.
Paper Type: Long
Research Area: AI/LLM Agents
Research Area Keywords: LLM agents
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Approaches for low compute settings-efficiency, Publicly available software and/or pre-trained models
Languages Studied: English
Submission Number: 8895