Keywords: Vision-Language Models (VLMs), Medical Images, Transferable Adversarial Attacks
Abstract: Vision-Language Models (VLMs) are increasingly used in clinical diagnostics, but their robustness to adversarial attacks is largely unexplored, posing serious risks. Existing attacks on medical images mostly target secondary goals such as model stealing or adversarial fine-tuning, while vanilla transferable attacks developed for natural images fail because they introduce visible distortions that clinicians can easily detect. To address this, we propose \textit{\textbf{MedFocusLeak}}, a novel, highly transferable black-box multimodal attack that forces incorrect medical diagnoses while keeping perturbations imperceptible. The approach strategically injects synergistic perturbations into non-diagnostic background regions of a medical image and uses an Attention-Distract loss to deliberately shift the model’s diagnostic focus away from pathological areas. Through comprehensive evaluations on six distinct medical imaging modalities, we demonstrate that MedFocusLeak attains state-of-the-art effectiveness, producing adversarial examples that elicit plausible but incorrect diagnostic outputs across a range of VLMs. We also propose an evaluation framework with new metrics that jointly capture the success of the misleading text generation and the preservation of medical image quality in a single statistic. Our findings expose a systematic weakness in the reasoning capabilities of contemporary VLMs in clinical settings.
Paper Type: Long
Research Area: Clinical and Biomedical Applications
Research Area Keywords: Clinical and Biomedical Applications, Safety and Alignment in LLMs
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 5315
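The abstract describes two components: perturbations restricted to non-diagnostic background regions, and an Attention-Distract loss that shifts attention away from pathological areas. The following is a minimal, hypothetical PyTorch sketch of that general idea only; the names (`attention_distract_loss`, `attn_fn`, `lesion_mask`), the PGD-style update, and the L-infinity budget are all illustrative assumptions, not the paper's actual MedFocusLeak implementation.

```python
import torch

def attention_distract_loss(attn_map: torch.Tensor, lesion_mask: torch.Tensor) -> torch.Tensor:
    """Penalize attention mass falling inside the pathological region,
    so minimizing this loss pushes the model's focus toward the background."""
    return (attn_map * lesion_mask).sum()

def attack_step(image, delta, lesion_mask, attn_fn, step_size=1.0 / 255, eps=8.0 / 255):
    """One PGD-style step that perturbs only non-diagnostic background pixels.

    image:       (C, H, W) clean medical image in [0, 1]
    delta:       (C, H, W) current perturbation
    lesion_mask: (H, W) binary mask, 1 inside the pathological region
    attn_fn:     assumed model hook mapping an image to a differentiable
                 (H, W) attention map over spatial locations
    """
    delta = delta.detach().requires_grad_(True)
    attn = attn_fn(image + delta)
    loss = attention_distract_loss(attn, lesion_mask)
    loss.backward()
    background = (1.0 - lesion_mask).unsqueeze(0)  # broadcast mask over channels
    with torch.no_grad():
        # Descend on the loss, zeroing updates on lesion pixels so the
        # perturbation stays in the background, then project into an
        # L-infinity ball to keep the change imperceptible.
        delta = (delta - step_size * delta.grad.sign()) * background
        delta = delta.clamp(-eps, eps)
    return delta
```

Iterating `attack_step` over a fixed number of steps would yield a background-only perturbation that draws the model's attention away from the lesion, which is the mechanism the abstract attributes to the method; the transferable black-box setting would additionally require surrogate models, which this sketch omits.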