Visual Semantics Meets Medical Diagnosis: Cross-Scale Embedding Alignment for Clinically Explainable Medical Image Segmentation
Keywords: Explainable AI, Vision-Language Models, Medical Image Segmentation, Vision-Encoders, Interpretable AI
TL;DR: A gradient-free XAI framework that aligns visual embeddings across scales to generate anatomically and clinically meaningful explanations for vision-language models in medical imaging.
Abstract: Clinical deployment of medical image segmentation requires explainable AI, yet vision-language models such as MedSAM (Ma et al., 2024) operate as black boxes. Existing gradient-based explanation methods such as Grad-CAM (Selvaraju et al., 2017) suffer from computational instability and fail to capture multi-modal feature interactions. We present a gradient-free framework that generates anatomically aligned saliency maps across embedding layers by computing the similarity between image features and reference representations. Our three-level methodology progresses from insights derived from image embeddings, to organ prototype similarity, to prompt-spatial embeddings analyzed through a four-component spatial system. Evaluated on the CHAOS (Kavur et al., 2021) and FLARE22 (Ma et al., 2023) datasets (13 organs), our approach reveals progressive reasoning: early layers show broad attention, intermediate layers narrow to organ-specific regions, and final layers produce precise boundary identification. This progression lets clinicians verify model decisions against their medical expertise.
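As a rough illustration of the similarity-based idea described in the abstract, the sketch below scores each spatial location of an embedding grid by its cosine similarity to a reference "organ prototype" vector, with no gradients involved. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`cosine`, `mean_prototype`, `saliency_map`), the choice of a masked-mean prototype, the min-max normalization, and the synthetic toy embeddings are all hypothetical. It uses only the Python standard library to stay self-contained.

```python
import math
import random


def cosine(u, v, eps=1e-8):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)


def mean_prototype(grid, mask):
    """Average the embedding vectors at locations where mask is True.

    Assumption: a prototype is the mean embedding over a reference region.
    """
    vecs = [
        grid[r][c]
        for r in range(len(grid))
        for c in range(len(grid[0]))
        if mask[r][c]
    ]
    dim = len(vecs[0])
    return [sum(v[d] for v in vecs) / len(vecs) for d in range(dim)]


def saliency_map(grid, prototype, eps=1e-8):
    """Gradient-free saliency: per-location cosine similarity, min-max scaled to [0, 1]."""
    sims = [[cosine(vec, prototype) for vec in row] for row in grid]
    lo = min(min(row) for row in sims)
    hi = max(max(row) for row in sims)
    return [[(s - lo) / (hi - lo + eps) for s in row] for row in sims]


# Toy data (hypothetical): a 6x6 grid of 8-dimensional embeddings in which
# a 2x2 "organ" region shares a common direction and the rest is noise.
rng = random.Random(0)
H, W, D = 6, 6, 8
mask = [[2 <= r <= 3 and 2 <= c <= 3 for c in range(W)] for r in range(H)]
grid = [
    [
        [rng.gauss(0.0, 0.1) + (1.0 if mask[r][c] else 0.0) for _ in range(D)]
        for c in range(W)
    ]
    for r in range(H)
]

prototype = mean_prototype(grid, mask)
saliency = saliency_map(grid, prototype)

inside = [saliency[r][c] for r in range(H) for c in range(W) if mask[r][c]]
outside = [saliency[r][c] for r in range(H) for c in range(W) if not mask[r][c]]
mean_inside = sum(inside) / len(inside)
mean_outside = sum(outside) / len(outside)
print(f"mean saliency inside region:  {mean_inside:.3f}")
print(f"mean saliency outside region: {mean_outside:.3f}")
```

In this toy setup the locations inside the reference region score higher on average than those outside it. Applying the same scoring to embeddings taken from different encoder layers would be one way to compare how attention changes with depth, though the exact procedure used in the paper is not specified here.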
⸻
References
Ma, J., He, Y., Li, F., Han, L., You, C., & Wang, B. (2024). Segment Anything in Medical Images. Nature Communications, 15, 654.
Selvaraju, R. R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., & Batra, D. (2017). Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization. Proceedings of the IEEE International Conference on Computer Vision (ICCV).
Kavur, A. E., Gezer, N. S., Barış, M., et al. (2021). CHAOS Challenge – Combined (CT-MR) Healthy Abdominal Organ Segmentation. Medical Image Analysis, 69, 101950.
Ma, J., et al. (2023). Unleashing the Strengths of Unlabeled Data in Pan-cancer Abdominal Organ Quantification: The FLARE22 Challenge. arXiv preprint arXiv:2308.05862.
Submission Number: 29