Exploiting Reversible Semantic Parsing and Text Generation for Error Correction with Pre-trained LLMs

ACL ARR 2024 June Submission1456 Authors

14 Jun 2024 (modified: 04 Jul 2024) · ACL ARR 2024 June Submission · CC BY 4.0
Abstract: Semantic parsing and text generation are reversible processes when working with Discourse Representation Structures (DRS). However, errors can arise in both parsing (text-to-DRS) and generation (DRS-to-text). This paper presents an approach that exploits the reversible nature of these tasks to automatically correct such errors without additional model training. We leverage pre-trained large language models (LLMs) in two pipeline setups, Pars-Gen-Pars and Gen-Pars-Gen, where the output of one model serves as the input to the next. In the Pars-Gen-Pars pipeline, input text is parsed into a DRS, the DRS is used to generate text, and that text is parsed again. Conversely, the Gen-Pars-Gen pipeline starts with a DRS, generates text, parses it, and regenerates text from the parsed DRS. By propagating data through these reversible pipelines, errors from the initial parsing or generation step can be mitigated rather than amplified. Experiments on the Parallel Meaning Bank dataset demonstrate the efficacy of our approach, with improved performance over baseline models on semantic parsing (SMATCH) and text generation (BLEU, METEOR, COMET, chrF, BERTScore) metrics. Our error analysis also sheds light on the types of mistakes addressed by each pipeline setup. The proposed method offers a simple yet effective way to enhance DRS-based natural language processing without costly model retraining.
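To make the two pipeline orderings concrete, the sketch below chains a text-to-DRS parser and a DRS-to-text generator in the orders described in the abstract. The function names, the use of plain strings for DRSs, and the stub implementations are illustrative assumptions for this sketch, not the authors' actual models or API.

```python
# Minimal sketch of the two reversible pipelines (Pars-Gen-Pars and Gen-Pars-Gen).
# parse_to_drs and generate_from_drs are hypothetical stand-ins for the
# pre-trained text-to-DRS and DRS-to-text LLMs; replace them with real model calls.

def parse_to_drs(text: str) -> str:
    """Hypothetical semantic parser: maps text to a DRS serialization."""
    raise NotImplementedError("plug in a pre-trained text-to-DRS model here")

def generate_from_drs(drs: str) -> str:
    """Hypothetical generator: maps a DRS serialization back to text."""
    raise NotImplementedError("plug in a pre-trained DRS-to-text model here")

def pars_gen_pars(text: str) -> str:
    """Text -> DRS -> text -> DRS: the second parse may correct errors in the first."""
    drs_1 = parse_to_drs(text)
    text_1 = generate_from_drs(drs_1)
    return parse_to_drs(text_1)

def gen_pars_gen(drs: str) -> str:
    """DRS -> text -> DRS -> text: the second generation may correct errors in the first."""
    text_1 = generate_from_drs(drs)
    drs_1 = parse_to_drs(text_1)
    return generate_from_drs(drs_1)
```

Either pipeline runs the existing models in sequence without any retraining, which is the efficiency argument made in the abstract.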
Paper Type: Long
Research Area: Efficient/Low-Resource Methods for NLP
Research Area Keywords: semantic parsing, text generation, discourse representation structures, error mitigation, pipeline approach, pre-trained LLMs
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Approaches low compute settings-efficiency, Publicly available software and/or pre-trained models
Languages Studied: English
Submission Number: 1456