Rad-Flamingo: A Multimodal Prompt driven Radiology Report Generation Framework with Patient-Centric Explanations

ACL ARR 2025 February Submission4416 Authors

15 Feb 2025 (modified: 09 May 2025) | ACL ARR 2025 February Submission | CC BY 4.0
Abstract: In modern healthcare, radiology plays a pivotal role in diagnosing and managing diseases. However, the complexity of medical imaging data and the variability in interpretation can lead to inconsistencies and a lack of patient-centered insights in radiology reports. To address these challenges, we propose Rad-Flamingo, a novel multimodal prompt-driven report generation framework that integrates diverse data modalities, such as medical images and clinical notes, to produce comprehensive and context-aware radiology reports. Our framework leverages innovative prompt engineering techniques to guide vision-language models in synthesizing relevant information, ensuring the generated reports are not only accurate but also understandable to individual patients. A key feature of our framework is its ability to provide patient-centric explanations, offering clear and personalized insights into diagnostic findings and their implications. Experimental results demonstrate the framework's effectiveness in enhancing report quality and improving understandability, which could foster better patient-doctor communication. This approach represents a significant step towards more intelligent, transparent, and human-centered medical AI systems.
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: Natural Language Explanations, Multimodality, Medical NLP
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Publicly available software and/or pre-trained models, Data resources
Languages Studied: English
Submission Number: 4416