MCTS-SQL: A Practical Framework for Text-to-SQL with Monte Carlo Tree Search

ACL ARR 2025 May Submission 1770 Authors

18 May 2025 (modified: 03 Jul 2025), ACL ARR 2025 May Submission, CC BY 4.0
Abstract: Text-to-SQL is a fundamental yet challenging task in NLP that aims to translate natural language questions into SQL queries. While recent advances in large language models (LLMs) have greatly improved performance, most existing approaches depend on models with tens of billions of parameters or on costly APIs (e.g., ChatGPT or Gemini), which limits their applicability in resource-constrained real-world environments. Enabling lightweight models for Text-to-SQL is therefore of great practical significance. However, smaller LLMs often struggle with understanding complex user intent, schema linking, and syntax correctness. To address these challenges, we propose MCTS-SQL, a novel framework that uses Monte Carlo Tree Search (MCTS) to guide SQL generation through multi-step refinement. Because lightweight models perform poorly in single-shot prediction, we generate better results through several trials with feedback. This mechanism also improves the model's reasoning ability on harder examples. Moreover, to filter out irrelevant database information, we design an additional schema selector. Experimental results on the SPIDER and BIRD benchmarks demonstrate the effectiveness of our approach. Using the small open-source Qwen2.5-Coder-Instruct-1.5B, our method outperforms ChatGPT-3.5. With GPT-4o as the base model, our method achieves a new SOTA execution accuracy of 69.40% on BIRD. Notably, our method achieves a significant performance improvement (51.48%) on the more challenging subset, outperforming the previous SOTA by 3.41%.
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: NLP Application, Efficient/Low-Resource Methods for NLP
Contribution Types: NLP engineering experiment
Languages Studied: English
Keywords: Text-to-SQL, MCTS, LLM
Submission Number: 1770
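
The abstract describes refining SQL over several trials with feedback, guided by MCTS, but gives no implementation details. The following is only a minimal sketch of that general idea, not the authors' method. Every name in it (`Node`, `uct`, `reward`, `toy_refine`, `mcts_refine`) is hypothetical. The refinement step, which a real system would presumably delegate to an LLM, is replaced by a hard-coded rewrite table, and the feedback signal is assumed to be a simple execution check against an in-memory SQLite database.

```python
import math
import sqlite3


class Node:
    """One candidate SQL query in the search tree (hypothetical structure)."""

    def __init__(self, sql, parent=None):
        self.sql = sql
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0


def uct(node, c=1.4):
    """Standard UCT score; unvisited nodes are explored first."""
    if node.visits == 0:
        return float("inf")
    exploit = node.value / node.visits
    explore = c * math.sqrt(math.log(node.parent.visits) / node.visits)
    return exploit + explore


def reward(sql, conn):
    """Assumed feedback: 0.0 on error, 0.5 if it runs but is empty, 1.0 if it returns rows."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return 0.0
    return 1.0 if rows else 0.5


def toy_refine(sql):
    """Stand-in for an LLM refinement call: a fixed table of candidate rewrites."""
    rules = {
        "SELECT nme FROM user": ["SELECT name FROM user", "SELECT nme FROM users"],
        "SELECT name FROM user": ["SELECT name FROM users"],
        "SELECT nme FROM users": ["SELECT name FROM users"],
    }
    return rules.get(sql, [])


def mcts_refine(root_sql, refine, conn, iterations=20):
    root = Node(root_sql)
    best_score, best_sql = reward(root_sql, conn), root_sql
    for _ in range(iterations):
        # Selection: descend by UCT until reaching a leaf.
        node = root
        while node.children:
            node = max(node.children, key=uct)
        # Expansion: expand the root, or any leaf that was already evaluated once.
        if node is root or node.visits > 0:
            node.children = [Node(s, node) for s in refine(node.sql)]
            if node.children:
                node = node.children[0]
        # Evaluation: score the candidate using execution feedback.
        score = reward(node.sql, conn)
        if score > best_score:
            best_score, best_sql = score, node.sql
        # Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.value += score
            node = node.parent
    return best_sql


# Toy usage: a one-table database and a deliberately broken initial query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('Ada')")
best_sql = mcts_refine("SELECT nme FROM user", toy_refine, conn)
print(best_sql)
```

In this toy run, both first-level rewrites still fail to execute, and the search only reaches a query that runs after a second refinement step. This illustrates why several feedback-guided trials can succeed where a single-shot prediction does not.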