ASPIRO: Any-shot Structured Parsing-error-Induced ReprOmpting for Consistent Data-to-Text Generation

Published: 07 Oct 2023, Last Modified: 01 Dec 2023
Venue: EMNLP 2023 Findings
Submission Type: Regular Short Paper
Submission Track: Efficient Methods for NLP
Submission Track 2: Human-Centered NLP
Keywords: Large Language Models, Data-to-Text, data disambiguation, structured data verbalisation, few-shot learning, multi-shot re-prompting
TL;DR: ASPIRO leverages LLMs and rule-based parsing to generate concise verbalisation templates from single-triple data entries, reducing parsing errors by 66% compared to the 0-shot setting and performing competitively on standard automatic metrics.
Abstract: We present ASPIRO, an approach for verbalising structured data into short template sentences in zero- to few-shot settings. Unlike previous methods, our approach prompts Large Language Models (LLMs) to directly produce entity-agnostic templates, rather than relying on LLMs to faithfully copy the given example entities or on manually validating and crafting the templates. We incorporate LLM re-prompting, triggered by algorithmic parsing checks, as well as consistency validation based on the PARENT metric, to identify and rectify template generation problems in real time. Compared to direct LLM output, ASPIRO reduces the parsing error rate of generated verbalisations of RDF triples on the DART dataset by 66% on average. Our best 5-shot text-davinci-003 setup, scoring BLEU of 50.62, METEOR of 45.16, BLEURT of 0.82, NUBIA of 0.87, and PARENT of 0.8962 on the Rel2Text dataset, competes effectively with recent fine-tuned pre-trained language models.
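The abstract describes re-prompting an LLM when rule-based parsing checks fail on a generated template. Below is a minimal sketch of such a parsing-error-induced re-prompting loop, assuming hypothetical placeholder conventions (`<subject>`, `<object>`), illustrative checks, and a stub LLM call; it is not the authors' implementation.

```python
# Minimal sketch of parsing-check-triggered re-prompting.
# Prompt wording, placeholder names, checks, and retry budget are all assumptions.
import re
from typing import Callable

PROMPT = (
    "Write one short sentence that verbalises the relation '{rel}' between a subject "
    "and an object. Use the placeholders <subject> and <object> exactly once each, "
    "and do not mention any concrete entities.\n{feedback}"
)

def check_template(template: str) -> list[str]:
    """Rule-based parsing checks: return human-readable error messages (empty if valid)."""
    errors = []
    if template.count("<subject>") != 1:
        errors.append("template must contain the placeholder <subject> exactly once")
    if template.count("<object>") != 1:
        errors.append("template must contain the placeholder <object> exactly once")
    if not re.search(r"[.!?]\s*$", template.strip()):
        errors.append("template must be a single complete sentence")
    return errors

def generate_template(rel: str, llm: Callable[[str], str], max_retries: int = 3) -> str:
    """Prompt the LLM; on parsing errors, re-prompt with the errors appended as feedback."""
    feedback = ""
    template = ""
    for _ in range(max_retries + 1):
        template = llm(PROMPT.format(rel=rel, feedback=feedback)).strip()
        errors = check_template(template)
        if not errors:
            return template
        feedback = "The previous attempt was invalid: " + "; ".join(errors)
    return template  # fall back to the last attempt if retries are exhausted

if __name__ == "__main__":
    # Stub LLM for demonstration; replace with a real model/API call.
    stub = lambda prompt: "<subject> was born in <object>."
    print(generate_template("birthPlace", stub))
```

In the paper's pipeline, a further consistency validation step based on the PARENT metric is also used to trigger re-prompting; that step is omitted here for brevity.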
Submission Number: 3767