Rad-Flamingo: A Multimodal Prompt-Driven Radiology Report Generation Framework with Patient-Centric Explanations

ACL ARR 2024 December Submission 970 Authors

15 Dec 2024 (modified: 05 Feb 2025) · ACL ARR 2024 December Submission · CC BY 4.0
Abstract: In modern healthcare, radiology plays a pivotal role in diagnosing and managing diseases. However, the complexity of medical imaging data and the variability in interpretation can lead to inconsistencies and a lack of patient-centered insights in radiology reports. To address these challenges, we propose a novel multimodal prompt-driven report generation framework that integrates diverse data modalities, such as medical images and clinical notes, to produce comprehensive and context-aware radiology reports. Our framework leverages prompt engineering techniques to guide vision-language models in synthesizing relevant information, ensuring that the generated reports are not only accurate but also tailored to individual patient profiles. A key feature of our framework is its ability to provide patient-centric explanations, offering clear and personalized insights into diagnostic findings and their implications. Experimental results demonstrate the framework's effectiveness in enhancing report quality and improving understandability, and suggest that it could foster better patient-doctor communication. This approach represents a significant step toward more intelligent, transparent, and human-centered medical AI systems.
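
The sketch below illustrates the general idea described in the abstract: a multimodal prompt is assembled from an image plus clinical notes and a patient-centric instruction, then passed to a vision-language model for report generation. It is a minimal, hypothetical illustration only; the BLIP-2 checkpoint, prompt template, file path, and helper function are assumptions and do not reflect the submission's actual Flamingo-style model or prompts.

```python
# Minimal, hypothetical sketch of multimodal prompt-driven report generation.
# BLIP-2 is used as a stand-in vision-language backbone; the prompt template
# and build_prompt helper are illustrative assumptions, not the paper's method.
from PIL import Image
import torch
from transformers import Blip2Processor, Blip2ForConditionalGeneration


def build_prompt(clinical_notes: str, patient_context: str) -> str:
    """Compose an instruction asking for findings plus a plain-language,
    patient-centric explanation (illustrative template)."""
    return (
        "You are a radiology assistant. Using the chest X-ray and the notes below, "
        "write (1) a structured findings and impression section and "
        "(2) a short explanation of the findings in plain language for the patient.\n"
        f"Clinical notes: {clinical_notes}\n"
        f"Patient context: {patient_context}\n"
        "Report:"
    )


processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
model = Blip2ForConditionalGeneration.from_pretrained("Salesforce/blip2-opt-2.7b")
model.to("cuda" if torch.cuda.is_available() else "cpu")

image = Image.open("chest_xray.png").convert("RGB")  # hypothetical input image
prompt = build_prompt("Persistent cough for two weeks.", "62-year-old former smoker.")

# Encode the image and text prompt together, then generate the report text.
inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=256)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0].strip())
```

In a framework like the one described, the prompt template would additionally be conditioned on the individual patient profile, and the backbone would be a radiology-adapted Flamingo-style model rather than an off-the-shelf checkpoint.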
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: Natural Language Explanations, Multimodality, Medical NLP
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Publicly available software and/or pre-trained models, Data resources
Languages Studied: English
Submission Number: 970