Abstractive Multi-document Summarization with Cross-Documents Discourse Relations

Published: 01 Jan 2023, Last Modified: 16 Apr 2025, ICONIP (11) 2023, CC BY-SA 4.0
Abstract: Generating a summary from a set of documents remains a challenging task. Abstractive multi-document summarization (MDS) methods have shown remarkable advantages over extractive MDS: they can express the information in the original documents in new sentences with higher continuity and readability. However, mainstream abstractive models, which are pre-trained on sentence pairs rather than entire documents, often fail to capture long-range dependencies throughout a document. To address this issue, we propose a novel abstractive MDS model that succinctly injects the semantic and structural information of elementary discourse units into the model to improve its generative ability. In particular, we first extract semantic features by splitting each document into discourse units and building its discourse tree. Then, we design discourse patterns that convert the raw document text and trees into a linearized format while preserving the correspondence between them. Finally, we employ an abstractive model that generates target summaries from the processed input sequence and learns the discourse semantic information. Extensive experiments show that our model outperforms current mainstream MDS methods in the ROUGE evaluation. This indicates the superiority of the proposed model and the capability of an abstractive model equipped with the hybrid pattern.
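The linearization step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the pattern format, so everything here is an assumption made for illustration: the `DiscourseNode` class, the `<Relation> ... </Relation>` tag syntax, and the `<doc-sep>` separator are hypothetical and are not the authors' actual discourse patterns.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DiscourseNode:
    """Hypothetical discourse-tree node: a leaf holds the text of one
    discourse unit; an internal node holds a relation label and children."""
    relation: Optional[str] = None
    text: Optional[str] = None
    children: List["DiscourseNode"] = field(default_factory=list)


def linearize(node: DiscourseNode) -> str:
    """Flatten a tree into one string, wrapping each subtree in tags named
    after its relation so the text-to-structure correspondence is kept."""
    if not node.children:
        return node.text or ""
    inner = " ".join(linearize(child) for child in node.children)
    return f"<{node.relation}> {inner} </{node.relation}>"


def linearize_documents(trees: List[DiscourseNode], sep: str = "<doc-sep>") -> str:
    """Join the linearized trees of several documents into one input sequence.
    The separator token is an assumption, not taken from the paper."""
    return f" {sep} ".join(linearize(tree) for tree in trees)


# Example tree for a single two-sentence document.
tree = DiscourseNode(
    relation="Elaboration",
    children=[
        DiscourseNode(text="A storm hit the coast."),
        DiscourseNode(
            relation="Cause",
            children=[
                DiscourseNode(text="Power failed"),
                DiscourseNode(text="because lines fell."),
            ],
        ),
    ],
)
print(linearize(tree))
```

The resulting string could then be fed to any sequence-to-sequence summarizer as ordinary input text; how the authors' model actually consumes the linearized sequence is not stated in the abstract.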