Improving Demonstration Diversity by Human-Free Fusing for Text-to-SQL

Anonymous

Improving Demonstration Diversity by Human-Free Fusing for Text-to-SQL

Anonymous

16 Feb 2024ACL ARR 2024 February Blind SubmissionReaders: Everyone

Abstract: Currently, the in-context learning method based on large language models (LLMs) has become the mainstream of text-to-SQL research. Previous works have discussed how to select demonstrations related to the user question from a human-labeled demonstration pool. However, human labeling suffers from the limitations of insufficient diversity and high labeling overhead. Therefore, in this paper, we discuss how to measure and improve the diversity of the demonstrations for text-to-SQL. We present a metric to measure the diversity of the demonstrations and analyze the insufficient of the existing labeled data by experiments. Based on the above discovery, we propose fusing iteratively for demonstrations (Fused) to build a high-diversity demonstration pool through human-free multiple-iteration synthesis, improving diversity and lowering label cost. Our method achieves an average improvement of 3.2% and 5.0% with and without human labeling on several mainstream datasets, which proves the effectiveness of Fused.

Paper Type: long

Research Area: Semantics: Sentence-level Semantics, Textual Inference and Other areas

Contribution Types: NLP engineering experiment

Languages Studied: English

0 Replies

Loading