EVO-RAG: Evolving Retrieval-Augmented Agents for Efficient Multi-Hop Query Optimization

08 Sept 2025 (modified: 12 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Retrieval-Augmented Generation, Reinforcement Learning
Abstract: Retrieval-augmented generation (RAG) grounds large language models (LLMs) in external evidence, yet multi-hop pipelines still suffer from redundant sub-queries, shallow exploration, and premature or delayed stopping. We present EVO-RAG, a phase-aware framework that couples a lightweight two-stage curriculum (Discovery → Refinement) with seven step-level rewards and an in-episode time scheduler. The scheduler decays exploration incentives as evidence accumulates while increasing efficiency and correctness pressure as uncertainty shrinks. Beyond scalar rewards, we train a multi-head preference model and benchmark DPO, PPO, and GRPO under identical rollouts and curricula for a controlled comparison. Evaluated on HotpotQA, 2WikiMultiHopQA, MuSiQue, and Bamboogle with 8B-class backbones, EVO-RAG improves EM/F1 while reducing redundant hops. Ablations show that (i) suppressing query overlap, (ii) rewarding controlled backtracking and justified refusal, and (iii) time-dynamic weighting are key to the accuracy-efficiency trade-off.
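
To make the time-scheduler idea concrete, the following minimal Python sketch shows one plausible way an in-episode scheduler could reweight step-level reward terms: exploration incentives decay as the episode progresses, while efficiency and correctness pressure grow as the agent's uncertainty shrinks. All function names, weight forms, and reward keys here are illustrative assumptions, not the authors' implementation; the abstract does not specify the exact functional forms or the full set of seven reward terms.

```python
import math

def scheduled_weights(step: int, max_steps: int, uncertainty: float) -> dict:
    """Hypothetical in-episode time scheduler (assumed forms).

    Decays the exploration weight over the episode, and raises
    efficiency/correctness weights as uncertainty (in [0, 1]) shrinks.
    """
    progress = step / max_steps                  # 0 at episode start, 1 at the end
    w_explore = math.exp(-3.0 * progress)        # exploration bonus decays over time
    w_efficiency = 1.0 - uncertainty             # more efficiency pressure when confident
    w_correct = 1.0 + (1.0 - uncertainty)        # correctness weighted up as uncertainty drops
    return {"explore": w_explore, "efficiency": w_efficiency, "correct": w_correct}

def step_reward(rewards: dict, weights: dict) -> float:
    """Weighted sum of step-level reward terms.

    Other terms named in the abstract (query-overlap penalty, controlled
    backtracking bonus, justified-refusal bonus) would enter the same way.
    """
    return sum(weights.get(name, 1.0) * value for name, value in rewards.items())

# Example: late in an episode with low uncertainty, exploration contributes
# little and efficiency/correctness dominate the step reward.
w = scheduled_weights(step=7, max_steps=8, uncertainty=0.2)
r = step_reward({"explore": 0.5, "efficiency": -0.1, "correct": 1.0}, w)
print(round(r, 3))
```

A multiplicative weighting of this kind keeps the individual reward heads separate (matching the multi-head preference model mentioned above) while letting a single schedule shift the accuracy-efficiency trade-off within an episode.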
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 3179