SQL-Checker: Error Detection and Labeling for Text-to-SQL with Interpretability Analysis

Published: 11 Apr 2026, Last Modified: 06 May 2026 | https://dl.acm.org/doi/abs/10.1145/3774904.3792212 | Readers: Everyone | License: CC BY 4.0
Abstract: Text-to-SQL technology converts natural language queries into SQL statements for database retrieval. Recent advances in large language models (LLMs) have improved Text-to-SQL performance, but generated SQL often contains semantic or syntax errors that degrade user experience and system stability. Existing SQL error detection methods are costly, lack interpretability, and do not support error labeling. To overcome these issues, we propose SQL-Checker, a specialized model for Text-to-SQL error detection. We first analyze common error factors in Text-to-SQL and design a novel data synthesis framework based on them. The framework simulates these error factors to construct a base set of erroneous SQL data, and then uses an error analysis template to distill high-quality SQL error analysis data from large-scale models. For complex errors, a self-guided iterative distillation strategy further enhances data quality. SQL-Checker is then trained on this distilled dataset. Additionally, we refine SQL error labeling and integrate error label recognition into the detection task, enabling macro-level cause analysis. Experiments show that SQL-Checker achieves state-of-the-art results on multiple error detection datasets. Incorporating SQL-Checker into the Text-to-SQL pipeline also improves execution accuracy.
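To make the idea of simulating error factors concrete, the following is a minimal sketch of how a correct query might be perturbed into labeled erroneous variants. It is an illustration only, not the paper's framework: the abstract does not list the actual error factors or label names, so the three perturbations (`wrong_operator`, `missing_condition`, `keyword_typo`), the `ErrorSample` record, and the `synthesize` function are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ErrorSample:
    question: str
    gold_sql: str
    wrong_sql: str
    error_label: str  # hypothetical label name, not taken from the paper


def flip_operator(sql: str) -> str:
    """Semantic error: turn the first ' > ' comparison into ' < '."""
    return sql.replace(" > ", " < ", 1)


def drop_where(sql: str) -> str:
    """Semantic error: remove the WHERE clause and everything after it."""
    return re.sub(r"\s+WHERE\s+.*$", "", sql, flags=re.IGNORECASE | re.DOTALL)


def misspell_from(sql: str) -> str:
    """Syntax error: misspell the first FROM keyword as FORM."""
    return re.sub(r"\bFROM\b", "FORM", sql, count=1)


# Assumed error factors, keyed by an illustrative label.
PERTURBATIONS: Dict[str, Callable[[str], str]] = {
    "wrong_operator": flip_operator,
    "missing_condition": drop_where,
    "keyword_typo": misspell_from,
}


def synthesize(question: str, gold_sql: str) -> List[ErrorSample]:
    """Apply each perturbation and keep only those that changed the query."""
    samples = []
    for label, perturb in PERTURBATIONS.items():
        wrong_sql = perturb(gold_sql)
        if wrong_sql != gold_sql:
            samples.append(ErrorSample(question, gold_sql, wrong_sql, label))
    return samples
```

Calling `synthesize("Which singers are older than 30?", "SELECT name FROM singer WHERE age > 30")` yields one sample per perturbation that applies. In a setup like the one the abstract describes, such pairs would be the raw material that a larger model then annotates with an error analysis; that distillation step is not shown here.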