ST-SQL: Semi-Supervised Self-Training for Text-to-SQL via Column Specificity Meta-Learning

Anonymous

16 Nov 2021 (modified: 05 May 2023), ACL ARR 2021 November Blind Submission, Readers: Everyone

Abstract: The few-shot problem is a pressing challenge for the generalization capability of single-table text-to-SQL models. Current few-shot methods neglect the information available in unlabeled data and suffer from domain bias because they weight all samples equally. Motivated by this, this paper proposes a Self-Training text-to-SQL (ST-SQL) method that addresses the problem at both the data level and the algorithm level. At the data level, ST-SQL expands the training data by using an iterative framework to attach pseudo-labels to unlabeled data. The expanded data are then sampled and fed back to retrain the model. At the algorithm level, ST-SQL defines column specificity to perform more fine-grained gradient updates during meta-training. Common samples are given more weight to eliminate the domain bias. ST-SQL achieves state-of-the-art results on both open-domain and domain-specific benchmarks and brings larger improvements on few-shot tests.
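
The data-level idea in the abstract (iteratively pseudo-labeling unlabeled questions and retraining on the expanded set) can be illustrated with a generic self-training loop. This is only a minimal sketch under stated assumptions, not the ST-SQL implementation: the abstract does not specify how pseudo-labels are accepted, how the expanded data are sampled, or how column specificity weights the meta-training updates, so the confidence threshold, the function names `self_train`, `toy_fit`, and `toy_predict`, and the toy lookup "model" are all hypothetical, and the sampling and weighting steps are omitted.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str]  # (natural-language question, SQL query)


def self_train(
    labeled: List[Example],
    unlabeled: List[str],
    fit: Callable[[List[Example]], object],
    predict: Callable[[object, str], Tuple[str, float]],
    rounds: int = 3,
    threshold: float = 0.9,  # assumed acceptance rule, not from the paper
) -> Tuple[object, List[Example]]:
    """Generic self-training: pseudo-label confident items, then retrain."""
    train = list(labeled)
    pool = list(unlabeled)
    model = fit(train)
    for _ in range(rounds):
        if not pool:
            break
        confident: List[Example] = []
        remaining: List[str] = []
        for question in pool:
            sql, confidence = predict(model, question)
            if confidence >= threshold:
                confident.append((question, sql))  # attach a pseudo-label
            else:
                remaining.append(question)
        if not confident:
            break  # nothing new to learn from in this round
        train.extend(confident)
        pool = remaining
        model = fit(train)  # retrain on the expanded data
    return model, train


# Hypothetical toy "model": maps a question's first word to an SQL string.
def toy_fit(examples: List[Example]) -> Dict[str, str]:
    return {question.split()[0]: sql for question, sql in examples}


def toy_predict(model: Dict[str, str], question: str) -> Tuple[str, float]:
    key = question.split()[0]
    if key in model:
        return model[key], 1.0
    return "", 0.0
```

With one labeled pair such as `("count rows", "SELECT COUNT(*) FROM t")` and the unlabeled questions `"count users"` and `"list names"`, the loop pseudo-labels only the first question, because the toy model has never seen a question starting with "list". A real system would replace the toy functions with a trained text-to-SQL parser and a calibrated confidence estimate.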
