DTS-SQL: Decomposed Text-to-SQL with Small Large Language Models

Published: 01 Jan 2024 · Last Modified: 25 Jan 2025 · EMNLP (Findings) 2024 · License: CC BY-SA 4.0
Abstract: Leading models for the text-to-SQL task rely heavily on proprietary Large Language Models (LLMs), raising concerns over data privacy. Closing the performance gap between small open-source models and large proprietary models is crucial to mitigating this reliance. To this end, we introduce a novel two-stage fine-tuning approach that decomposes the task into two simpler tasks. Through comprehensive evaluation on three large cross-domain datasets and two small LLMs, we show that this approach improves execution accuracy by 3 to 7 percent, effectively aligning the performance of open-source models with that of their proprietary counterparts. Our proposed method achieves 60.31% execution accuracy on the BIRD hold-out test set, the highest among methods using 7B-parameter models.
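In the paper, the two simpler tasks are schema linking (selecting only the tables relevant to the question) followed by SQL generation over the pruned schema. Below is a minimal sketch of that two-stage inference pipeline, assuming Hugging Face transformers; the checkpoint names `my-org/schema-linker-7b` and `my-org/sql-generator-7b` are hypothetical placeholders, not the paper's released models, and the prompt formats are illustrative only.

```python
# Sketch of two-stage decomposed text-to-SQL inference.
# Assumes: pip install transformers torch; placeholder model names below.
from transformers import AutoModelForCausalLM, AutoTokenizer


def run_stage(model_name: str, prompt: str, max_new_tokens: int = 256) -> str:
    """Generate a completion for one stage with a fine-tuned model.

    For clarity this loads the model per call; in practice both
    stage models would be loaded once and reused.
    """
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Return only the newly generated tokens, not the echoed prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


def text_to_sql(question: str, full_schema: str) -> str:
    # Stage 1: schema linking -- predict the relevant tables/columns so
    # the second stage conditions on a much smaller schema.
    linked_schema = run_stage(
        "my-org/schema-linker-7b",  # hypothetical fine-tuned checkpoint
        f"Schema:\n{full_schema}\n\nQuestion: {question}\n\nRelevant tables:",
    )
    # Stage 2: SQL generation over the pruned schema only.
    sql = run_stage(
        "my-org/sql-generator-7b",  # hypothetical fine-tuned checkpoint
        f"Schema:\n{linked_schema}\n\nQuestion: {question}\n\nSQL:",
    )
    return sql.strip()
```

Splitting the work this way keeps each fine-tuning target simple enough for a 7B-parameter model: the first model only has to rank schema elements, and the second sees a short, relevant schema instead of the full database.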