Medical Scientific Table-to-Text Generation with Synthetic Data under Data Sparsity Constraint

Heng-Yi Wu; Jingqing Zhang; Julia Ive; Tong Li; Vibhor Gupta; Bingyuan Chen; Yike Guo

03 Oct 2022 (modified: 05 May 2023) | NeurIPS 2022 SyntheticData4ML | Readers: Everyone

Keywords: Natural Language Generation, Data-to-Text Generation, Synthetic Data Generation

Abstract: An efficient table-to-text summarization system can drastically reduce the manual effort needed to understand tabular data and summarize it in textual reports. In practice, however, the task is heavily impeded by data sparsity and by the inability of state-of-the-art natural language generation models (such as T5, PEGASUS, and GPT-Neo) to produce coherent and accurate outputs. This is particularly true in pre-clinical and clinical domains. In this paper, we propose a novel table-to-text approach that tackles these problems with synthetic data generation and a copy mechanism. Experiments show that the proposed method improves the copying of concise and relevant information from tabular data when generating assay validation and toxicology reports.

Supplementary Material: zip
