Keywords: Multimodal Learning, Vision-Language Models, Chest X-ray, Automated Report Generation, Longitudinal Analysis, Radiology
TL;DR: SuSufDoctor is a multimodal chest X-ray report generation system that integrates current and prior images with patient metadata, aiming to improve report accuracy and turnaround time in resource-limited settings.
Abstract: Chest X-ray interpretation is a critical diagnostic task, yet radiologists in low-resource
settings often face high workloads and long reporting times due to severe workforce
shortages. Current automated report generation systems primarily rely on single-image
analysis and cannot incorporate longitudinal comparisons or patient metadata, limiting their
clinical usefulness. This project addresses these gaps by developing SuSufDoctor, an
intelligent, multimodal chest X-ray report generation system powered by a fine-tuned
SmolVLM-500M vision-language transformer. The system integrates current and prior chest
X-ray images, radiology reports, and patient metadata to produce comprehensive, structured
reports covering Findings and Impression sections. A longitudinal multimodal dataset was
constructed from CheXpert-Plus to train and evaluate the model.
Through LoRA-based fine-tuning and quantization, the model's scores improved
markedly: BLEU rose from 1.29% to 61.53%,
ROUGE-L from 8.26% to 66.08%, and BERTScore (F1) from 80.48% to 93.92%. The
system was deployed as a web application, enabling real-time inference and practical
integration into radiologist workflows. Overall, the results suggest that SuSufDoctor can
enhance diagnostic support, reduce reporting burden, and improve access to quality radiology
services in resource-limited environments such as Ghana and Rwanda, which have about
150 and 10 radiologists, respectively.
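For readers unfamiliar with the ROUGE-L metric reported above, the following is a minimal sketch of how such a score can be computed from the longest common subsequence (LCS) of a reference and a generated report. It assumes whitespace tokenization, lowercasing, and a plain F1 combination of precision and recall; the abstract does not state which ROUGE implementation or settings were used, so this is illustrative only. The function names and the example sentences are hypothetical and are not taken from the dataset.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Tokens match: extend the LCS found for the two prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # No match: carry over the better of the two neighbours.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """ROUGE-L F1 under assumed settings: lowercase, whitespace tokens."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of generated tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


# Hypothetical example sentences (not from CheXpert-Plus).
score = rouge_l_f1("the heart size is normal", "heart size is normal")
print(f"ROUGE-L F1: {score:.4f}")
```

In this example the LCS covers all four generated tokens and four of the five reference tokens, so precision is 1.0 and recall is 0.8. Published ROUGE-L figures often come from library implementations that add stemming or a different precision/recall weighting, so scores from this sketch may not match them exactly.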
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 11