An open chest X-ray dataset with benchmarks for automatic radiology report generation in French

Published: 01 Jan 2024, Last Modified: 28 Sept 2024. Venue: Neurocomputing 2024. License: CC BY-SA 4.0
Abstract: Medical report generation (MRG), which aims to automatically generate a textual description of a given medical image (e.g., a chest X-ray), has recently received increasing research interest. The success of image captioning has made MRG achievable. However, generating language-specific radiology reports remains a challenge for data-driven models because they rely on paired image-report chest X-ray datasets, which are labor-intensive, time-consuming, and costly to build. In this paper, we introduce a chest X-ray benchmark dataset, CASIA-CXR, consisting of high-resolution chest radiographs accompanied by narrative reports originally written in French. To the best of our knowledge, this is the first public chest radiograph dataset with medical reports in French. We also propose a simple yet effective multimodal, contextually guided encoder–decoder framework for medical report generation in French. We validated our framework through intra-language and cross-language contextual analysis, supplemented by expert evaluation performed by radiologists. The dataset is freely available at: https://www.casia-cxr.net/.