Contrastive Hierarchical Discourse Graph for Vietnamese Extractive Multi-Document Summarization

Published: 01 Jan 2023, Last Modified: 21 Feb 2025. Venue: IALP 2023. License: CC BY-SA 4.0
Abstract: Extractive Multi-Document Summarization (EMDS) distills information from multiple sources, supporting efficient knowledge synthesis and document retrieval. However, achieving high-quality EMDS remains a challenge, particularly in languages with distinctive linguistic characteristics such as Vietnamese. In this paper, we adapt the Contrastive Hierarchical Discourse Graph (CHDG), an approach designed to address these challenges. CHDG operates at multiple levels (sentence, section, document, and document cluster), capturing discourse relationships and global thematic coherence. We employ a contrastive learning framework to enhance sentence representations, enabling CHDG to select coherent and contextually relevant sentences for the final summary. We evaluate CHDG on a benchmark Vietnamese news dataset, where it achieves superior performance in terms of ROUGE scores and human evaluation. Our results demonstrate the potential of CHDG to advance the state of the art in Vietnamese EMDS, contributing to more effective information condensation and knowledge synthesis.
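The four levels named in the abstract (sentence, section, document, cluster) can be pictured as a containment hierarchy. The following is a minimal sketch of that structure only, not the authors' implementation: the function name `build_hierarchy`, the node-id scheme, and the nested-list input format are all hypothetical, and the discourse edges and contrastive training that CHDG uses are not specified in the abstract and are not modeled here.

```python
def build_hierarchy(cluster):
    """Build a containment hierarchy for a cluster of documents.

    `cluster` is a list of documents; each document is a list of sections;
    each section is a list of sentence strings (an assumed input format).
    Returns (nodes, edges): `nodes` maps a node id to (level, text or None),
    and `edges` is a list of (child_id, parent_id) pairs.
    """
    nodes = {"cluster": ("cluster", None)}
    edges = []
    for d, doc in enumerate(cluster):
        doc_id = f"doc{d}"
        nodes[doc_id] = ("document", None)
        edges.append((doc_id, "cluster"))
        for s, section in enumerate(doc):
            sec_id = f"{doc_id}/sec{s}"
            nodes[sec_id] = ("section", None)
            edges.append((sec_id, doc_id))
            for k, sentence in enumerate(section):
                sent_id = f"{sec_id}/sent{k}"
                # Only sentence nodes carry text; higher levels are containers.
                nodes[sent_id] = ("sentence", sentence)
                edges.append((sent_id, sec_id))
    return nodes, edges


if __name__ == "__main__":
    # Toy cluster: two documents, three sections, four sentences.
    toy = [[["a", "b"], ["c"]], [["d"]]]
    nodes, edges = build_hierarchy(toy)
    print(len(nodes), "nodes,", len(edges), "edges")
```

In an extractive setting, the sentence nodes would be the candidates for selection, while the section, document, and cluster nodes provide progressively broader context.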