RadTextAid: A CNN-Guided Framework Utilizing Lightweight Vision-Language Models for Assistive Radiology Reporting

Published: 07 Mar 2025, Last Modified: 25 Mar 2025 · GenAI4Health Poster · CC BY 4.0
Keywords: Multi-modal vision language model, chest X-ray report generation
TL;DR: Our study proposes a novel framework using CNN-guided labels and lightweight vision language models for automatically generating radiology reports.
Abstract:

Deciphering chest X-rays is crucial for diagnosing thoracic diseases such as pneumonia, lung cancer, and cardiomegaly. Radiologists often work under significant workloads and handle large volumes of data, which can lead to exhaustion and burnout. Advanced deep learning models can effectively generate draft radiology reports, potentially alleviating radiologists' workload. However, many current systems produce reports that include clinically irrelevant or redundant information. To address these limitations, we propose RadTextAid, a novel multi-modal framework for generating high-quality, clinically relevant radiology reports. Our approach integrates vision-language models (VLMs) for natural language generation, augmented by disease-specific tags derived from a CNN that analyzes chest X-ray images to identify key pathological features. A key component of our framework is the pre-processing of the radiology report training dataset. This removes routine, repetitive, or non-informative phrases commonly found in chest X-ray reports and ensures that the model focuses its learning on clinically meaningful content, which expert radiologists qualitatively validated. Experimental results show that our system yields an absolute improvement of 4.8% in BERTScore and 3.16% in the F1-CheXbert metric compared to a state-of-the-art model. These results demonstrate that the proposed RadTextAid framework not only improves the detection of abnormalities in chest X-ray images but also enhances the overall quality and coherence of the generated reports, paving the way toward more efficient and effective radiology reporting.
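The pre-processing step described in the abstract can be sketched as a sentence-level filter over the training reports. The phrase list and function below are purely illustrative assumptions; the abstract states only that the actual filtered phrases were curated and qualitatively validated by expert radiologists.

```python
import re

# Illustrative examples of routine, non-informative phrases that often
# appear in chest X-ray reports. The real RadTextAid filter list is not
# given in the abstract; these entries are hypothetical placeholders.
BOILERPLATE_PHRASES = [
    "the lungs are clear bilaterally",
    "no acute osseous abnormality",
    "as compared to the previous radiograph",
]

def clean_report(report: str) -> str:
    """Drop sentences containing any boilerplate phrase, keep the rest."""
    sentences = re.split(r"(?<=[.!?])\s+", report.strip())
    kept = [
        s for s in sentences
        if not any(p in s.lower() for p in BOILERPLATE_PHRASES)
    ]
    return " ".join(kept)
```

Filtering at the sentence level (rather than deleting whole reports) preserves the clinically meaningful findings while removing the repetitive template text the model should not learn to reproduce.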

Submission Number: 64