Medical Concept Guided Frequency Representation for Chest X-ray Report Generation

Published: 2026, Last Modified: 15 Jan 2026, Cogn. Comput. 2026, CC BY-SA 4.0
Abstract: Automatic chest X-ray report generation aims to alleviate radiologists' workload by generating diagnostic narratives directly from medical images. Despite the progress driven by image captioning techniques, existing models often struggle to emulate the cognitive process of radiologists in identifying subtle pathological cues and establishing coherent visual-linguistic associations. The challenge lies in the limited capacity to perceive subtle details in the spatial domain and in the semantic gap between visual representations and language generation. These cognitive mismatches frequently lead to inaccurate or incomplete reports. We propose a novel method named Medical Concept guided Frequency Representation (MCFeR), which leverages frequency-domain features to enhance the subtle visual details often overlooked in the spatial domain. MCFeR assigns medical concept tags to different frequency regions of the chest X-ray image, allowing the model to mine discriminative features guided by medical knowledge. Furthermore, to bridge the cross-modal semantic gap, MCFeR establishes associations between medical concepts and visual features by aligning concept tags with corresponding visual regions, thus promoting more coherent and semantically accurate report generation. Extensive experiments on two public datasets, IU-Xray and MIMIC-CXR, demonstrate that our method achieves competitive or superior performance compared to state-of-the-art approaches across various evaluation metrics. In addition, a human evaluation conducted by professional radiologists further confirms the clinical validity and readability of the reports generated by MCFeR. The experimental results show that our medical concept guided frequency representation method can effectively capture subtle visual differences in chest X-ray images and mitigate the semantic gap between visual and textual modalities.
The experiments on the IU-Xray and MIMIC-CXR datasets, as well as the human evaluation by radiologists, illustrate the effectiveness of our method in improving the accuracy and clinical relevance of automatic chest X-ray report generation.
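The abstract does not specify how MCFeR defines its "frequency regions", so the following is only a minimal sketch of the general idea of viewing an image in the frequency domain, not the authors' implementation. It assumes a 2-D grayscale image and splits its Fourier spectrum into concentric radial bands; the function name `split_frequency_bands`, the `n_bands` parameter, and the choice of equal-width radial bands are all hypothetical and chosen for illustration.

```python
import numpy as np


def split_frequency_bands(image, n_bands=3):
    """Split a 2-D grayscale image into radial frequency bands.

    Illustrative assumption only: equal-width radial bands in the shifted
    2-D FFT spectrum. Returns a list of spatial-domain images ordered from
    low to high frequency; their sum reconstructs the input image.
    """
    h, w = image.shape
    # Move the zero-frequency (DC) component to the centre of the spectrum.
    spectrum = np.fft.fftshift(np.fft.fft2(image))

    # Distance of every frequency bin from the centre (DC) bin.
    yy, xx = np.mgrid[0:h, 0:w]
    radius = np.hypot(yy - h // 2, xx - w // 2)
    edges = np.linspace(0.0, radius.max(), n_bands + 1)

    bands = []
    for i in range(n_bands):
        lo, hi = edges[i], edges[i + 1]
        if i == n_bands - 1:
            # Include the outermost radius in the last band.
            mask = (radius >= lo) & (radius <= hi)
        else:
            mask = (radius >= lo) & (radius < hi)
        # Keep only this band's coefficients and return to the spatial
        # domain; any residual imaginary part is discarded.
        band = np.fft.ifft2(np.fft.ifftshift(spectrum * mask)).real
        bands.append(band)
    return bands


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((64, 64))  # stand-in for a chest X-ray, not real data
    parts = split_frequency_bands(img, n_bands=3)
    print(len(parts), np.allclose(sum(parts), img))
```

In a sketch like this, the highest band carries fine edges and texture while the lowest carries overall intensity and coarse anatomy, which gives one plausible reading of why subtle details may be easier to isolate in the frequency domain than in the spatial domain.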