Evaluating factual accuracy in complex data-to-text

Craig Thomson, Ehud Reiter, Barkavi Sundararajan

Published: 2023, Last Modified: 22 May 2025. Comput. Speech Lang. 2023. License: CC BY-SA 4.0

Abstract: Highlights
• Factual accuracy problems limit the usefulness of neural solutions for complex data-to-text.
• Existing evaluation methods miss many of these errors, such as hallucination.
• We propose and evaluate a gold standard protocol for detecting factual errors in generated text.
• We show how this gold standard can be used to measure the efficacy of other methods.
• We also explore the common types of error in both human-authored and neural data-to-text systems.