When are radiology reports useful for training medical image classifiers?

09 Oct 2025 (modified: 11 Oct 2025), EurIPS 2025 Workshop MedEurIPS Submission, CC BY 4.0
Keywords: vision language models, vlm, medical image classification, distillation, privileged information, x-ray, radiograph, radiology, pre-training, image-text alignment, prognosis, diagnosis
TL;DR: We explore when including radiology reports during pre-training and fine-tuning benefits medical image-only classifiers.
Abstract: Prior work on leveraging radiology reports *during training* has been limited to evaluating pre-trained image representations by fine-tuning them to predict diagnostic labels, which are often extracted from the reports themselves, and has ignored tasks whose labels are only weakly associated with the text. To address this gap, we conduct a systematic study of how radiology reports can be used during both pre-training and fine-tuning, across diagnostic and prognostic tasks and under varying training set sizes. Our findings reveal that: (1) leveraging reports during pre-training is beneficial for downstream classification tasks where the label is well-represented in the text; however, image-text alignment can be detrimental in non-diagnostic settings where the label is not well-represented in the text; (2) fine-tuning with reports can lead to significant improvements.
Submission Number: 12