Integrating Ontology-Based Knowledge to Improve Biomedical Multi-Document Summarization Model

Quoc-An Nguyen, Khanh-Vinh Nguyen, Hoang-Quynh Le, Duy-Cat Can, Tam Doan Thanh, Trung-Hieu Do, Mai-Vu Tran

Published: 01 Jan 2023, Last Modified: 21 Feb 2025. Venue: ACIIDS (2) 2023. License: CC BY-SA 4.0

Abstract: Most existing extractive summarization models rely only on information internal to the original text and score the importance of each sentence individually. When applied to specific domains (such as verbal text or biomedical literature), these models have several drawbacks: they struggle with the variety of synonymous terms, with unknown words and terminology, and with the intra-document and inter-document relations between sentences or terms. In this work, we propose an ontology-based summarization model that leverages multiple knowledge bases to understand the input documents. The model is built on an integrated ontology and a signal transmission-based method for extending domain knowledge, such as related terms and the relationships between terms and sentences. It proved effective, achieving the highest ROUGE-2 F1 score on the test set of the MEDIQA 2021 MAS shared task.
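
For readers unfamiliar with the metric named in the abstract, ROUGE-2 F1 measures bigram overlap between a generated summary and a reference summary, combining precision and recall into a single score. The following is a minimal sketch only, assuming lowercased whitespace tokenization with no stemming or stopword handling. The function names `bigrams` and `rouge2_f1` are hypothetical, and this is not the evaluation script used by the authors or by the shared task, whose scores may differ from this simplified version.

```python
from collections import Counter


def bigrams(text):
    """Count the bigrams of a text (assumption: lowercase, whitespace tokens)."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


def rouge2_f1(candidate, reference):
    """Simplified ROUGE-2 F1: harmonic mean of bigram precision and recall."""
    cand, ref = bigrams(candidate), bigrams(reference)
    # Clipped overlap: each bigram counts at most as often as it appears in both.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

A candidate identical to its reference scores 1.0, and a candidate sharing no bigrams with the reference scores 0.0. Official ROUGE implementations typically add options such as stemming, so this sketch should be read as an illustration of the definition rather than a replacement for them.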