Annotation-Free Reinforcement Learning Query Rewriting via Verifiable Search Reward

ACL ARR 2026 January Submission 245 Authors

22 Dec 2025 (modified: 20 Mar 2026), ACL ARR 2026 January Submission, Readers: Everyone, License: CC BY 4.0

Keywords: query rewriting, query rewriter, reinforcement learning, RAG, agent

Abstract: Optimizing queries for Retrieval-Augmented Generation (RAG) systems is a significant challenge, particularly when the indices span diverse modalities. We introduce RL-QR, a novel annotation-free reinforcement learning framework for query rewriting that eliminates the need for costly human-annotated data. By deriving verifiable search rewards from index-aligned synthetic queries, RL-QR removes the dependence on human annotation and extends to a range of modalities and index domains. Experimental results demonstrate the framework's robustness: on the MTEB VIDORE V2 benchmark for unstructured visual documents, it achieves retrieval performance gains of up to 3.9× with lexical retrievers and 3.5× with semantic retrievers, along with consistent 5% to 10% improvements on MS MARCO v2.1 and internal industrial datasets.

Paper Type: Long

Research Area: Retrieval-Augmented Language Models

Research Area Keywords: Generation, Efficient/Low-Resource Methods for NLP, Dialogue and Interactive Systems, Information Retrieval and Text Mining, Multimodality and Language Grounding to Vision, Robotics and Beyond

Contribution Types: NLP engineering experiment, Approaches to low-resource settings, Approaches to low compute settings-efficiency

Languages Studied: English, Korean

Submission Number: 245
