Keywords: Radiology report generation (RRG), low-resource settings, multi-expert image fusion, error-feedback augmentation
Abstract: Radiology report generation (RRG) is critical for assisting clinical diagnosis, yet current methods struggle to integrate multi-view images and longitudinal patient data when annotation resources are constrained. Existing approaches often rely on large-scale supervised datasets and adapt poorly to low-resource settings. To address these challenges, we propose FEFA, an approach that combines a multi-expert image fusion module with an error-feedback augmentation strategy powered by large language models. The fusion module dynamically combines current and prior images using tailored attention and gating mechanisms, producing compact and informative representations. The error-feedback mechanism enables self-correction during training by incorporating error analyses from previous stages. Experiments on MIMIC-CXR show that, using only 8% of the supervised data, FEFA attains 94% of the clinical efficacy score of the best existing method while outperforming all other competitors. These results indicate improved data efficiency and model adaptability for real-world clinical scenarios.
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 14839