Sticking to the Facts: Confident Decoding for Faithful Data-to-Text Generation

Ran Tian; Shashi Narayan; Thibault Sellam; Ankur P. Parikh

25 Sept 2019 (modified: 05 May 2023); ICLR 2020 Conference Blind Submission; Readers: Everyone

Keywords: Natural Language Processing, Text Generation, Data-to-Text Generation, Hallucination, Calibration, Variational Bayes

TL;DR: We propose a confidence-oriented decoder to reduce hallucination in neural structured-data-to-text generation.

Abstract: Neural conditional text generation systems have made significant progress in recent years and can produce highly fluent text. However, the inherent lack of controllability in these systems allows them to hallucinate factually incorrect phrases that are unfaithful to the source, which often makes them unsuitable for real-world systems that require a high degree of precision. In this work, we propose a novel confidence-oriented decoder that assigns a confidence score to each target position. This score is learned during training using a variational Bayes objective and can be leveraged at inference time, using a calibration technique, to promote more faithful generation. Experiments on WikiBio, a structured data-to-text dataset, show that our approach is more faithful to the source than existing state-of-the-art approaches, according to both automatic metrics and human evaluation.
