Rad-Flamingo: A Multimodal Prompt driven Radiology Report Generation Framework with Patient-Centric Explanations

ACL ARR 2025 May Submission5678 Authors

20 May 2025 (modified: 29 Jul 2025), ACL ARR 2025 May Submission, CC BY 4.0

Abstract: In modern healthcare, radiology plays a pivotal role in diagnosing and managing diseases. However, the complexity of medical imaging data and the variability in interpretation can lead to inconsistencies and a lack of patient-centered insight in radiology reports. To address this challenge, we develop Rad-Flamingo, a novel multimodal prompt-driven report generation framework that integrates diverse data modalities, such as medical images and clinical notes, to produce comprehensive and context-aware radiology reports. Our framework leverages innovative prompt engineering techniques to guide vision-language models in generating relevant information, so that the generated reports are not only accurate but also understandable to individual patients. A key feature of our framework is its ability to provide patient-centric explanations, offering clear and personalized insights into diagnostic findings and their implications. We also present a synthetic data generation pipeline that augments the findings and impressions of existing benchmark datasets with patient-centric explanations. Experimental results demonstrate the framework's effectiveness in enhancing report quality and improving understandability, which could foster better patient-doctor communication. This approach represents a significant step towards human-centered medical AI systems.

Paper Type: Long

Research Area: NLP Applications

Research Area Keywords: Natural Language Explanations, Multimodality, Medical NLP

Contribution Types: Model analysis & interpretability, NLP engineering experiment, Publicly available software and/or pre-trained models, Data resources

Languages Studied: English

Keywords: Model analysis & interpretability, NLP engineering experiment, Publicly available software and/or pre-trained models, Data resources

Submission Number: 5678
