Contrastive concept-phrase pre-training for generating clinically accurate and interpretable chest X-ray reports

Abdallah Tubaishat, Tehseen Zia, David Windridge, Muhammad Nawaz, Muhammad Saad Razzaq

Published: 2025, Last Modified: 05 May 2026. Neural Comput. Appl. 2025. CC BY-SA 4.0

Abstract: Automated radiology report generation is an emerging field for improving patient care and alleviating radiologist workload. However, existing methods face several challenges, including limited data availability, weak performance on clinical metrics, and limited interpretability. To address these issues, we propose a contrastive concept-phrase pre-training (C2P2) method, which uses a phrase-concept grounding task for contrastive learning. C2P2 learns the correspondence between phrases in a report and image concepts by using a phrase classification task to train a multi-label classifier for X-rays and extracting visual concepts of phrases using class activation maps. We then fine-tune a pre-trained BERT model to translate the extracted phrases into reports. Our proposed method outperforms or matches the previous state of the art in clinical efficacy metrics on both internal and external datasets. Moreover, C2P2 leverages more vision-language data for pre-training and provides visual explanations of generated phrases.
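
The phrase-concept grounding step described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not specify the network architecture, how visual concepts are pooled from the class activation maps, or the exact form of the contrastive loss, so everything below is an assumption made for illustration: a classifier head that is a single linear layer after global average pooling, softmax-weighted pooling of features under each map, and a symmetric InfoNCE loss standing in for the contrastive objective. All function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def class_activation_maps(features, weights):
    """Compute one activation map per phrase class.

    features: (C, H, W) feature maps from the last convolutional layer.
    weights:  (K, C) weights of an assumed linear classifier applied
              after global average pooling (bias omitted).
    Returns:  (K, H, W) class activation maps.
    """
    return np.einsum("kc,chw->khw", weights, features)

def pooled_concepts(features, cams):
    """Pool one visual concept vector per class (assumed pooling scheme).

    Each map is turned into spatial attention with a softmax, then used
    to take a weighted average of the feature vectors. Returns (K, C).
    """
    k = cams.shape[0]
    flat = cams.reshape(k, -1)
    flat = flat - flat.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(flat)
    attn /= attn.sum(axis=1, keepdims=True)
    return attn @ features.reshape(features.shape[0], -1).T

def info_nce(concepts, phrases, temperature=0.1):
    """Symmetric InfoNCE loss (an assumed stand-in for the paper's loss).

    Row i of `concepts` is treated as the positive match for row i of
    `phrases`; all other rows act as negatives.
    """
    c = concepts / np.linalg.norm(concepts, axis=1, keepdims=True)
    p = phrases / np.linalg.norm(phrases, axis=1, keepdims=True)
    logits = c @ p.T / temperature

    def cross_entropy_on_diagonal(l):
        l = l - l.max(axis=1, keepdims=True)
        log_prob = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))

    return 0.5 * (cross_entropy_on_diagonal(logits)
                  + cross_entropy_on_diagonal(logits.T))

# Toy usage with random data: 8 feature channels, a 4x4 grid, 3 phrase classes.
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 4, 4))
weights = rng.normal(size=(3, 8))
cams = class_activation_maps(features, weights)
concepts = pooled_concepts(features, cams)
phrase_embeddings = rng.normal(size=(3, 8))  # placeholder for text-encoder output
loss = info_nce(concepts, phrase_embeddings)
```

Under these assumptions, the spatial mean of each map equals the classifier logit for that phrase class, which is why the same maps can serve as visual explanations of the phrases the classifier predicts.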

External IDs: dblp:journals/nca/TubaishatZWNR25