Keywords: Natural Language Generation, Data-to-Text Generation, Synthetic Data Generation
Abstract: An efficient table-to-text summarization system can drastically reduce the manual effort needed to understand tabular data and summarize it in textual reports. In practice, however, the task is heavily impeded by data sparsity and by the inability of state-of-the-art natural language generation models (such as T5, PEGASUS, and GPT-Neo) to produce coherent and accurate outputs. This is particularly true in the pre-clinical and clinical domains. In this paper, we propose a novel table-to-text approach that tackles these problems using synthetic data generation and a copy mechanism. Experiments show that the proposed method improves the copying of concise and relevant information from tabular data when generating assay validation and toxicology reports.
Supplementary Material: zip