Keywords: Question Answering, Search Agent, Reinforcement Learning
Abstract: Retrieval-augmented generation (RAG) reduces hallucinations and factual errors in large language models (LLMs) by conditioning generation on retrieved external knowledge. Recent search agents further cast RAG as an autonomous, multi-turn information-seeking process. However, existing methods often accumulate irrelevant or noisy documents and rely on sparse reinforcement learning signals. We propose SE-Search, a Self-Evolving Search agent that improves online search behavior through three components: memory purification, atomic query training, and dense rewards. SE-Search follows a Think-Search-Memorize strategy that retains salient evidence while filtering irrelevant content. Atomic query training promotes shorter and more diverse queries, improving evidence acquisition. Dense rewards provide fine-grained feedback that accelerates training. Experiments on single-hop and multi-hop question answering benchmarks show that SE-Search-3B outperforms strong baselines, yielding a 10.8-point absolute improvement and a 33.8% relative gain over Search-R1.
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: NLP Applications, Question Answering, Dialogue and Interactive Systems
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Publicly available software and/or pre-trained models
Languages Studied: English
Submission Number: 363