Abstract: Numerous deep learning (DL)-based approaches have been developed for medical report generation (MRG), aiming to automate the description of medical images. These reports typically comprise two sections: the findings, which describe the visual content of the images, and the impression, which summarizes the diagnosis or assessment. Given the distinct abstraction levels of these two sections, conventional end-to-end DL methods that generate both simultaneously may not be optimal. To address this challenge, we introduce a novel Hierarchical Medical Report Generation Network (Hi-MrGn) designed to better reflect the inherent structure of medical reports. Hi-MrGn operates in two stages: it first generates the findings from multimodal input data, including medical images and auxiliary diagnostic texts, and then produces the impression from both the findings and the images. To enhance the semantic coherence between the findings and the impression, we incorporate a contrastive learning module into Hi-MrGn. We validate our approach on two public X-ray image datasets, MIMIC-CXR and IU-Xray, and demonstrate that our method surpasses current state-of-the-art (SOTA) techniques in this domain.
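To make the two-stage pipeline described above concrete, the sketch below gives one plausible reading of the abstract in PyTorch. All module names, feature dimensions, and the InfoNCE-style contrastive objective between findings and impression representations are assumptions made for illustration; the authors' actual architecture and loss may differ.

```python
# Minimal sketch of a two-stage findings-then-impression generator with a
# contrastive coherence term, assuming generic Transformer decoder stages.
import torch
import torch.nn as nn
import torch.nn.functional as F

class HiMrGnSketch(nn.Module):
    def __init__(self, d_model=512, vocab_size=10000):
        super().__init__()
        self.image_encoder = nn.Linear(2048, d_model)            # stand-in for a CNN/ViT backbone
        self.text_embed = nn.Embedding(vocab_size, d_model)      # shared token embedding
        self.findings_decoder = nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True)
        self.impression_decoder = nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True)
        self.proj = nn.Linear(d_model, d_model)                  # projection head for contrastive loss

    def forward(self, image_feats, aux_text_ids, findings_ids, impression_ids):
        # Stage 1: generate findings from the image and auxiliary diagnostic text
        # (teacher forcing shown for brevity).
        img = self.image_encoder(image_feats)                    # (B, N, d)
        aux = self.text_embed(aux_text_ids)                      # (B, T_aux, d)
        memory = torch.cat([img, aux], dim=1)
        findings_h = self.findings_decoder(self.text_embed(findings_ids), memory)

        # Stage 2: generate the impression conditioned on the findings and the image.
        memory2 = torch.cat([img, findings_h], dim=1)
        impression_h = self.impression_decoder(self.text_embed(impression_ids), memory2)

        # Contrastive module: pull matched findings/impression representations together,
        # push apart mismatched pairs within the batch (InfoNCE-style, assumed).
        f = F.normalize(self.proj(findings_h.mean(dim=1)), dim=-1)    # (B, d)
        i = F.normalize(self.proj(impression_h.mean(dim=1)), dim=-1)  # (B, d)
        logits = f @ i.t() / 0.07                                     # temperature-scaled similarity
        targets = torch.arange(f.size(0), device=f.device)
        contrastive_loss = F.cross_entropy(logits, targets)
        return findings_h, impression_h, contrastive_loss
```

In this reading, the contrastive term would be added to the usual token-level generation losses of both stages so that the impression remains semantically consistent with the generated findings.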
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: data-to-text generation, multimodal applications, healthcare applications
Contribution Types: Theory
Languages Studied: English
Keywords: medical report generation, multimodal fusion, hierarchical structure
Submission Number: 4914