Improving Demonstration Diversity by Human-Free Fusing for Text-to-SQL

ACL ARR 2024 June Submission2492 Authors

15 Jun 2024 (modified: 10 Aug 2024) · ACL ARR 2024 June Submission · CC BY 4.0
Abstract: In-context learning with large language models (LLMs) is currently the mainstream approach to text-to-SQL. Previous studies have explored selecting relevant demonstrations from a human-labeled demonstration pool, but such pools lack diversity and incur high labeling costs. In this work, we focus on measuring and enhancing the diversity of the text-to-SQL demonstration pool. First, we introduce a diversity metric and show that the diversity of existing labeled data can be further enhanced. Motivated by these findings, we propose **Fused**, which iteratively fuses demonstrations to build a diverse demonstration pool, either on top of human labeling or entirely from scratch with LLMs, reducing labeling costs. **Fused** achieves an average improvement of 3.2% when building on existing labels and 5.0% when starting from scratch on several mainstream datasets, demonstrating its effectiveness.
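As a rough illustration of the idea, the sketch below pairs a simple diversity metric (average pairwise cosine distance over demonstration embeddings, an assumption for illustration; the paper defines its own metric) with an iterative fusion loop. `llm_fuse` and `embed` are hypothetical helpers standing in for the LLM fusion prompt and the embedding model; this is not the paper's exact algorithm.

```python
import random

import numpy as np


def pool_diversity(embeddings: np.ndarray) -> float:
    """Average pairwise cosine distance across the pool.

    A simple stand-in diversity metric (assumption); the paper's
    actual metric may differ.
    """
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = normed @ normed.T
    # Average over distinct pairs only (strict upper triangle).
    pair_sims = sims[np.triu_indices(len(embeddings), k=1)]
    return float(1.0 - pair_sims.mean())


def fuse_pool(pool: list[dict], llm_fuse, embed, rounds: int = 100) -> list[dict]:
    """Iteratively fuse random demonstration pairs into new demonstrations.

    `llm_fuse(a, b)` is a hypothetical callable that prompts an LLM to
    merge two (question, SQL) demonstrations into a new one; `embed`
    maps a pool of demonstrations to an embedding matrix.
    """
    for _ in range(rounds):
        a, b = random.sample(pool, 2)
        candidate = llm_fuse(a, b)
        # Keep the fused demonstration only if it raises pool diversity.
        if pool_diversity(embed(pool + [candidate])) > pool_diversity(embed(pool)):
            pool.append(candidate)
    return pool
```

Under this reading, the same loop covers both settings in the abstract: seeding `pool` with human-labeled demonstrations, or with a handful of LLM-generated ones when starting from scratch.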
Paper Type: Long
Research Area: Semantics: Lexical and Sentence-Level
Research Area Keywords: Syntax: Parsing, NLP Applications
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 2492
