Improving Biomedical Abstractive Summarisation with Knowledge Aggregation from Citation Papers

Chen Tang; Shun Wang; Tomas Goldsack; Chenghua Lin

Published: 07 Oct 2023, Last Modified: 01 Dec 2023, EMNLP 2023 Main

Submission Type: Regular Long Paper

Submission Track: Natural Language Generation

Submission Track 2: Summarization

Keywords: Biomedical Text Summarisation, Abstractive Summarisation, Knowledge Aggregation, Citation Graph

TL;DR: We propose a novel dataset and a novel framework for improving biomedical abstractive summarisation by aggregating knowledge from citation papers.

Abstract: Abstracts derived from biomedical literature possess distinct domain-specific characteristics, including specialised writing styles and biomedical terminologies, which necessitate a deep understanding of the related literature. As a result, existing language models struggle to generate technical summaries that are on par with those produced by biomedical experts, given the absence of domain-specific background knowledge. This paper aims to enhance the performance of language models in biomedical abstractive summarisation by aggregating knowledge from external papers cited within the source article. We propose a novel attention-based citation aggregation model that integrates domain-specific knowledge from citation papers, allowing neural networks to generate summaries by leveraging both the paper content and relevant knowledge from citation papers. Furthermore, we construct and release a large-scale biomedical summarisation dataset that serves as a foundation for our research. Extensive experiments demonstrate that our model outperforms state-of-the-art approaches and achieves substantial improvements in abstractive biomedical text summarisation.
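
Illustrative sketch (not the authors' implementation): the abstract describes an attention-based model that aggregates knowledge from citation papers. The general idea can be shown with scaled dot-product attention, where a vector for the source paper attends over vectors for its cited papers. Everything here is an assumption made for illustration: the function name `aggregate_citations`, the use of one fixed-size vector per paper, and the residual addition used to fuse the paper vector with the citation context. The actual architecture in the paper may differ.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def aggregate_citations(paper_vec, citation_vecs):
    """Hypothetical helper: fuse a paper vector with its citation vectors.

    The paper vector acts as the attention query; each citation vector
    acts as both key and value. Returns (fused_vector, attention_weights).
    """
    if not citation_vecs:
        # No citations available: fall back to the paper vector alone.
        return list(paper_vec), []

    # Scaled dot-product scores between the paper and each citation.
    scale = math.sqrt(len(paper_vec))
    scores = [dot(paper_vec, c) / scale for c in citation_vecs]
    weights = softmax(scores)

    # Attention-weighted sum of the citation vectors.
    context = [
        sum(w * c[i] for w, c in zip(weights, citation_vecs))
        for i in range(len(paper_vec))
    ]

    # Assumed fusion step: residual addition of the citation context.
    fused = [p + c for p, c in zip(paper_vec, context)]
    return fused, weights


paper = [1.0, 0.0]
citations = [[1.0, 0.0], [0.0, 1.0]]
fused, weights = aggregate_citations(paper, citations)
```

In this toy input the first citation points in the same direction as the paper vector, so it receives the larger attention weight. A real system would obtain these vectors from a pretrained encoder and pass the fused representation to a decoder that generates the summary.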

Submission Number: 1091
