Keywords: Large Language Models, Search Capability Incentivization, Curriculum Learning, Tool Learning
Abstract: Retrieval-Augmented Generation (RAG) is essential for grounding Large Language Models, yet its deployment is hindered by the high latency and financial costs of frequent retrieval. Existing reinforcement learning approaches primarily maximize answer accuracy, inadvertently encouraging excessive search behavior and ignoring the trade-off between performance and efficiency.
To resolve this, we introduce PragmaticSearch, a framework that learns efficient retrieval. Unlike standard RL, PragmaticSearch employs a Two-Stage Advantage Shaping (TSAS) curriculum that explicitly decouples capability learning from cost calibration. We further introduce a gating mechanism, theoretically grounded in a Bayesian estimate of retrieval necessity, that dynamically neutralizes cost penalties when retrieval provides high utility. Optimized via our stabilized LS-GRPO algorithm, this approach prevents policy collapse. Experiments across seven benchmarks show that PragmaticSearch reduces retrieval calls by up to 76.2% while maintaining competitive performance.
Paper Type: Long
Research Area: AI/LLM Agents
Research Area Keywords: LLM agents
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Approaches for low compute settings-efficiency, Publicly available software and/or pre-trained models
Languages Studied: English
Submission Number: 8895