GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing

Tao Yu; Chien-Sheng Wu; Xi Victoria Lin; Bailin Wang; Yi Chern Tan; Xinyi Yang; Dragomir Radev; Richard Socher; Caiming Xiong

Published: 12 Jan 2021, Last Modified: 03 Apr 2024, ICLR 2021 Poster, Readers: Everyone

Keywords: text-to-sql, semantic parsing, pre-training, nlp

Abstract: We present GraPPa, an effective pre-training approach for table semantic parsing that learns a compositional inductive bias in the joint representations of textual and tabular data. We construct synthetic question-SQL pairs over high-quality tables via a synchronous context-free grammar (SCFG). We pre-train our model on the synthetic data to inject important structural properties commonly found in semantic parsing into the pre-trained language model. To maintain the model's ability to represent real-world data, we also include masked language modeling (MLM) on several existing table-related datasets to regularize our pre-training process. Our proposed pre-training strategy is highly data-efficient. When incorporated with strong base semantic parsers, GraPPa achieves new state-of-the-art results on four popular fully supervised and weakly supervised table semantic parsing tasks.
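
The SCFG-based synthesis step mentioned in the abstract can be pictured with a minimal sketch. It assumes a toy grammar in which each rule pairs a question template with an SQL template that share the same non-terminals, so one set of bindings fills both sides at once. The table schema, the rules, the operator list, and the `sample_pair` helper are all hypothetical illustrations and are not taken from the paper or its released code.

```python
import random

# Hypothetical toy table schema (an assumption for illustration only).
TABLE = {"name": "students", "columns": ["name", "age", "grade"]}

# Each synchronous rule pairs a question template with an SQL template.
# Both sides use the same non-terminals, so they are expanded together.
RULES = [
    (
        "show the {COLUMN0} where {COLUMN1} is {OP_TEXT} {VALUE}",
        "SELECT {COLUMN0} FROM {TABLE} WHERE {COLUMN1} {OP_SQL} {VALUE}",
    ),
    (
        "how many rows have {COLUMN1} {OP_TEXT} {VALUE}",
        "SELECT COUNT(*) FROM {TABLE} WHERE {COLUMN1} {OP_SQL} {VALUE}",
    ),
]

# Operator non-terminal: natural-language form paired with SQL form.
OPS = [("greater than", ">"), ("less than", "<"), ("equal to", "=")]


def sample_pair(table, rng):
    """Sample one (question, SQL) pair by binding shared non-terminals."""
    question_tpl, sql_tpl = rng.choice(RULES)
    col0, col1 = rng.sample(table["columns"], 2)
    op_text, op_sql = rng.choice(OPS)
    bindings = {
        "TABLE": table["name"],
        "COLUMN0": col0,
        "COLUMN1": col1,
        "OP_TEXT": op_text,
        "OP_SQL": op_sql,
        "VALUE": str(rng.randint(1, 100)),
    }
    # str.format ignores bindings that a template does not use.
    return question_tpl.format(**bindings), sql_tpl.format(**bindings)


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        question, sql = sample_pair(TABLE, rng)
        print(question, "=>", sql)
```

Because both templates are filled from the same bindings, every sampled question is aligned with its SQL by construction, which is the property that makes such synthetic pairs usable as pre-training data.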

One-sentence Summary: Language model pre-training for table semantic parsing.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Data: [Spider-Realistic](https://paperswithcode.com/dataset/spider-realistic)
