MedGazeShift: Transferable Multimodal Adversarial Attacks for Diagnostic Misdirection in Vision-Language Models

ICLR 2026 Conference Submission 19304 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Vision-Language Models (VLMs), Medical Images, Adversarial Transferable Attacks
TL;DR: Adversarial Attacks on Medical Images for VLMs
Abstract: Vision-Language Models (VLMs) are increasingly used in clinical diagnostics, but their robustness to adversarial attacks remains largely unexplored, posing serious risks. Existing medical image attacks mostly target secondary goals such as model stealing, while transferable attacks designed for natural images fail in this setting because they introduce visible distortions that clinicians can easily detect. To address this, we propose MedGazeShift, a novel and highly transferable black-box multimodal attack that forces incorrect medical diagnoses while keeping perturbations imperceptible. The approach strategically injects synergistic perturbations into non-diagnostic background regions of an image and uses an Attention-Distract loss to deliberately shift the model's diagnostic focus away from pathological areas. Through comprehensive evaluations on six distinct medical imaging modalities, we demonstrate that MedGazeShift attains state-of-the-art effectiveness, producing adversarial examples that elicit plausible but incorrect diagnostic outputs across a range of VLMs. We also propose a novel evaluation framework with new metrics that capture both the success of the misleading text generation and the preservation of medical image quality in a single statistic. Our findings expose a systematic weakness in the reasoning capabilities of contemporary VLMs in clinical settings. More broadly, our work shows that insights into model internals, such as attention, can inform practical control methods and support safer deployment of multimodal systems.
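The abstract names two mechanisms: confining the perturbation to non-diagnostic background pixels, and an Attention-Distract loss that penalizes attention mass on pathological regions. A minimal sketch of both ideas is below; note that `attention_distract_loss`, `masked_perturbation`, and the binary `lesion_mask` are hypothetical illustrations, not the paper's actual implementation, which is not specified in this page.

```python
import numpy as np

def attention_distract_loss(attn_map, lesion_mask):
    """Hypothetical Attention-Distract loss: the fraction of the model's
    attention mass falling on the pathological region. Minimizing it
    pushes attention away from the lesion toward the background."""
    attn = attn_map / (attn_map.sum() + 1e-8)   # normalize to a distribution
    return float((attn * lesion_mask).sum())    # attention mass on the lesion

def masked_perturbation(image, delta, lesion_mask, eps=4 / 255):
    """Confine an L_inf-bounded perturbation to non-diagnostic background
    pixels; lesion_mask == 1 marks diagnostic regions, left untouched."""
    delta = np.clip(delta, -eps, eps) * (1.0 - lesion_mask)
    return np.clip(image + delta, 0.0, 1.0)

# Toy example: 4x4 uniform attention map with a 2x2 "lesion" top-left.
attn = np.ones((4, 4))
mask = np.zeros((4, 4))
mask[:2, :2] = 1.0
loss = attention_distract_loss(attn, mask)  # uniform attention -> 4/16 = 0.25
```

In a full attack, this loss would be minimized jointly with a diagnostic-misdirection objective, with gradients estimated in a black-box fashion since the paper targets black-box transferability.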
Supplementary Material: zip
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 19304