Keywords: Text-to-SQL, Large Language Models, Risk Hedging, Bidirectional Schema Linking
Abstract: Text-to-SQL generation aims to translate natural language questions into SQL statements. In LLM-based Text-to-SQL, schema linking is a widely adopted strategy that streamlines the input by selecting only relevant schema elements, thereby reducing noise and computational overhead. However, schema linking carries risks that require caution, including the potential omission of necessary elements and disruption of the database's structural integrity. To address these challenges, we propose RSL-SQL, a novel framework that combines bidirectional schema linking, contextual information augmentation, and a risk hedging selection strategy. On the BIRD dataset, we use both forward and backward pruning to improve the recall of schema linking, achieving a strict recall of 94\% while reducing the number of input columns by 83\%. Our framework further hedges risk by voting between a full-schema mode and a simplified mode enhanced with contextual information. Experiments show that our approach achieves accuracy comparable to the state of the art on multiple benchmarks. Moreover, when adopting the much cheaper DeepSeek-V2 with the same intact prompts, our approach outperforms a series of GPT-4 based Text-to-SQL systems. Extensive analysis and ablation studies confirm the effectiveness of each component of our framework.
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: semantic parsing, code generation, LLM Efficiency, prompting, table QA
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 8967