RAD-SRAC: Simple Retrieval Augmented Classification for Radiology

Published: 07 Mar 2025, Last Modified: 25 Mar 2025, Venue: GenAI4Health Poster, License: CC BY 4.0
Keywords: small language models, vision language models, retrieval augmented generation, AI in radiology, clinical decision support, Generative AI in Medical Imaging, Multimodal AI for Healthcare
TL;DR: A training-free approach that improves medical image classification by combining specialized medical image encoders with few-shot prompting across X-ray, CT, and MRI modalities, leveling the playing field for small language models in healthcare.
Abstract: The rapid advancement of artificial intelligence in healthcare has made automated medical image analysis increasingly important for improving diagnostic accuracy. Large Vision Language Models (VLMs) show promise in understanding medical imagery, but their reliance on static training data often leads to outdated or inaccurate information. Current approaches to medical image classification rely on either text-based retrieval or general-purpose image encoders, and so lack the specialized understanding required for complex medical diagnostics. We address these limitations with a training-free retrieval-augmented generation approach that combines a specialized medical image encoder with few-shot learning across multiple imaging modalities (X-ray, CT, and MRI). Experiments on three diverse medical imaging datasets show substantial improvements in classification performance, with F1 score gains of up to 142% for state-of-the-art VLMs and up to 250% for smaller deployable models, while requiring only 3-5 retrieved reference images. This levels the playing field for on-premise clinical applications of smaller language models.
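The retrieval step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the encoder, the similarity measure, or the prompt format, so cosine similarity, the toy 2-D embeddings, and the helper names `retrieve_top_k` and `build_few_shot_prompt` are all assumptions made for illustration. In a real pipeline the embeddings would come from a medical image encoder and the retrieved reference images themselves would be passed to the VLM alongside their labels.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_top_k(
    query: Vector, index: List[Tuple[Vector, str]], k: int = 3
) -> List[Tuple[float, str]]:
    """Return the k (similarity, label) pairs most similar to the query embedding."""
    scored = [(cosine_similarity(query, emb), label) for emb, label in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_few_shot_prompt(
    neighbors: List[Tuple[float, str]], class_names: List[str]
) -> str:
    """Assemble a text prompt listing the retrieved references as few-shot examples.

    Hypothetical format: only labels and similarities are shown here; an actual
    VLM call would also attach the reference images and the query image.
    """
    lines = ["Classify the query image into one of: " + ", ".join(class_names) + "."]
    for i, (score, label) in enumerate(neighbors, start=1):
        lines.append(f"Reference {i} (similarity {score:.2f}): label = {label}")
    lines.append("Query image: label = ?")
    return "\n".join(lines)


# Toy labeled index; real embeddings would be produced by an image encoder.
index = [
    ([1.0, 0.0], "pneumonia"),
    ([0.9, 0.1], "pneumonia"),
    ([0.0, 1.0], "normal"),
    ([0.1, 0.9], "normal"),
]
top = retrieve_top_k([1.0, 0.05], index, k=3)
prompt = build_few_shot_prompt(top, ["normal", "pneumonia"])
```

Because nothing is trained, adding a new class or modality in this kind of setup only requires adding labeled embeddings to the index; the value of `k` corresponds to the 3-5 retrieved reference images mentioned in the abstract.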
Submission Number: 32