Revisiting and Improving Generic Compositional Generalization of LLMs for Semantic Parsing in the Minimum Coverage Scenario
Abstract: Compositional generalization is an essential ability for large language models (LLMs) on semantic parsing tasks. Previous research typically relies on task-specific designs or large numbers of demonstration samples to improve the compositional generalization of LLMs on semantic parsing. We revisit this issue and find that when the number of samples in a demonstration is limited to the theoretical lower bound for achieving compositional generalization (the minimum coverage scenario), current advanced LLMs cannot reliably achieve good compositional generalization across different semantic parsing tasks without task-specific designs. We propose Multi-level Component Composition (MC$^2$), a task-independent framework based on input primitives that helps LLMs achieve compositional generalization in the minimum coverage scenario by selecting and organizing samples from multiple compositional levels that satisfy primitive coverage. Experiments and analysis show that MC$^2$ effectively improves the compositional generalization of LLMs on different semantic parsing tasks in the minimum coverage scenario.
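The abstract does not spell out MC$^2$'s selection procedure, so the following is only a minimal sketch of the general idea of primitive-coverage demonstration selection: greedily pick samples until every primitive in the test input is covered, then order them from simple to complex as a rough proxy for compositional level. All function names and the SCAN-style toy samples are hypothetical illustrations, not the paper's actual algorithm.

```python
def select_demonstrations(test_primitives, candidate_pool):
    """Greedily pick a near-minimal set of (sample, primitive_set) pairs
    whose primitives jointly cover every primitive in the test input.

    Greedy set cover is a stand-in here; the paper's MC^2 selection
    strategy may differ.
    """
    uncovered = set(test_primitives)
    pool = list(candidate_pool)
    selected = []
    while uncovered:
        # Choose the candidate covering the most still-uncovered primitives.
        best = max(pool, key=lambda cand: len(uncovered & cand[1]))
        if not uncovered & best[1]:
            raise ValueError("pool cannot cover: " + ", ".join(sorted(uncovered)))
        selected.append(best)
        uncovered -= best[1]
        pool.remove(best)
    # Order demonstrations from fewer to more primitives, a crude proxy
    # for arranging samples across compositional levels.
    selected.sort(key=lambda cand: len(cand[1]))
    return selected


if __name__ == "__main__":
    # Hypothetical SCAN-like candidate pool: (sample text, its primitives).
    pool = [
        ("jump twice -> JUMP JUMP", {"jump", "twice"}),
        ("walk left -> LTURN WALK", {"walk", "left"}),
        ("look -> LOOK", {"look"}),
    ]
    demos = select_demonstrations({"jump", "left", "twice"}, pool)
    for text, prims in demos:
        print(text)
```

On this toy input, the greedy pass picks the samples covering "jump"/"twice" and "left", then presents them in increasing primitive count, so the prompt stays at the coverage lower bound while still spanning more than one compositional level.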
Paper Type: Long
Research Area: Syntax: Tagging, Chunking and Parsing
Research Area Keywords: compositionality, semantic parsing
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 388