MARS-SQL: A Multi-Agent Reinforcement Learning Framework for Text-to-SQL

Haolin Yang; Jipeng Zhang; Zhitao He; Yi R. Fung

16 Sept 2025 (modified: 11 Feb 2026) | Submitted to ICLR 2026 | Everyone | CC BY 4.0

Keywords: Text2SQL, LLM, Reinforcement Learning, Multi-Agent

Abstract: Translating natural language to SQL remains a significant challenge for complex queries requiring environmental interaction and self-correction. To address this, we introduce MARS-SQL, a novel multi-agent framework that combines principled task decomposition and interactive reinforcement learning (RL). Our system comprises three specialized agents: a Grounding Agent for schema linking, a Generation Agent for query generation, and a Validation Agent for final selection. The core of our framework is the Generation Agent, which is trained via a multi-turn RL policy. Adopting a ReAct-style Think-Act-Observe loop, the agent iteratively generates thoughts, executes SQL actions against a live database, and revises its strategy based on execution feedback, enabling dynamic, stateful reasoning and self-correction. At inference time, we generate multiple interaction trajectories to explore diverse reasoning paths. The Validation Agent then selects the optimal trajectory by modeling verification as a next-token prediction task and choosing the solution with the highest generation probability. This structured workflow, which pipelines specialized agents and combines interactive RL for generation with generative modeling for verification, proves highly effective for robust and accurate SQL generation. Experiments show that MARS-SQL achieves state-of-the-art Execution Accuracy of 77.84% on the BIRD dev set and 89.75% on the Spider test set.
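The Think-Act-Observe loop and the trajectory selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `execute`, `rollout`, `select_best`, `toy_policy`, and `make_db` are hypothetical, the policy is a hand-written stub standing in for the RL-trained Generation Agent, and the `score` callable is a placeholder for the Validation Agent's generation probability. Only the standard-library `sqlite3` module is used.

```python
import sqlite3
from typing import Callable, List, Tuple

# Hypothetical types for illustration only.
Step = Tuple[str, str, str]  # (thought, sql, observation)
Policy = Callable[[str, List[Step]], Tuple[str, str, bool]]


def execute(conn: sqlite3.Connection, sql: str) -> str:
    """Act + Observe: run the SQL and return rows or the error text."""
    try:
        return repr(conn.execute(sql).fetchall())
    except sqlite3.Error as exc:
        return f"ERROR: {exc}"


def rollout(conn: sqlite3.Connection, question: str, policy: Policy,
            max_turns: int = 5) -> List[Step]:
    """One interaction trajectory: think, act, observe, then revise."""
    history: List[Step] = []
    for _ in range(max_turns):
        thought, sql, is_final = policy(question, history)
        observation = execute(conn, sql)
        history.append((thought, sql, observation))
        # Stop only when the policy is done and the query actually ran.
        if is_final and not observation.startswith("ERROR"):
            break
    return history


def select_best(trajectories: List[List[Step]],
                score: Callable[[List[Step]], float]) -> List[Step]:
    """Pick the highest-scoring trajectory (score is a stand-in verifier)."""
    return max(trajectories, key=score)


def toy_policy(question: str, history: List[Step]) -> Tuple[str, str, bool]:
    """Stub policy: makes a column-name mistake, then corrects it."""
    if not history:
        return ("Try the obvious column name.",
                "SELECT nme FROM singer", True)
    return ("The previous query failed; the column is 'name'.",
            "SELECT name FROM singer ORDER BY name", True)


def make_db() -> sqlite3.Connection:
    """Tiny in-memory database used only for this example."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE singer (name TEXT)")
    conn.executemany("INSERT INTO singer VALUES (?)", [("Ada",), ("Bo",)])
    return conn


if __name__ == "__main__":
    trajectory = rollout(make_db(), "List singer names.", toy_policy)
    for thought, sql, observation in trajectory:
        print(thought, "|", sql, "|", observation)
```

In this sketch the first query fails, the error message is fed back as the observation, and the second turn repairs the query. In a setting like the one the abstract describes, `rollout` would be called several times with a sampled policy, and `select_best` would rank the resulting trajectories with a learned verifier score rather than a hand-written function.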

Primary Area: applications to computer vision, audio, language, and other modalities

Submission Number: 7852
