Rad-Flamingo: A Multimodal Prompt driven Radiology Report Generation Framework with Patient-Centric Explanations
Abstract: In modern healthcare, radiology plays a pivotal role in diagnosing and managing diseases. However, the complexity of medical imaging data and the variability in interpretation can lead to inconsistencies and a lack of patient-centered insights in radiology reports. To address these challenges, we propose Rad-Flamingo, a novel multimodal prompt-driven report generation framework that integrates diverse data modalities, such as medical images and clinical notes, to produce comprehensive and context-aware radiology reports. Our framework leverages innovative prompt engineering techniques to guide vision-language models in synthesizing relevant information, ensuring the generated reports are not only accurate but also understandable to individual patients. A key feature of our framework is its ability to provide patient-centric explanations, offering clear and personalized insights into diagnostic findings and their implications. Experimental results demonstrate the framework's effectiveness in enhancing report quality and improving understandability, which could foster better patient-doctor communication. This approach represents a significant step towards more intelligent, transparent, and human-centered medical AI systems.
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: Natural Language Explanations, Multimodality, Medical NLP
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Publicly available software and/or pre-trained models, Data resources
Languages Studied: English
Submission Number: 4416