Abstract: Data-to-text generation (D2T) aims to produce natural language statements that fluently and accurately describe structured data such as graphs, tables, and meaning representations (MRs) in the form of key-value pairs. It is a typical and crucial task in natural language generation (NLG). Early D2T systems generated text at the cost of human engineering effort in designing domain-specific rules and templates, and achieved acceptable performance in coherence, fluency, and fidelity. In recent years, data-driven D2T systems based on deep learning have reached state-of-the-art (SOTA) performance on more challenging datasets. In this paper, we provide a comprehensive review of existing neural data-to-text generation approaches. We first introduce available D2T resources, including systematically categorized D2T datasets and mainstream evaluation metrics. Next, we survey existing work under a taxonomy along two axes: neural end-to-end D2T and neural modular D2T. We also discuss potential applications and adverse impacts. Finally, we present the challenges faced by neural D2T and outline potential future directions in this area.