Abstract: Medical Vision-Language Models (Med-VLMs) are gaining popularity across medical tasks such as visual question answering (VQA), captioning, and diagnosis support. However, despite their impressive performance, Med-VLMs remain vulnerable to adversarial attacks, much like their general-purpose counterparts. In this work, we investigate the cross-prompt transferability of adversarial attacks on Med-VLMs in the context of VQA. To this end, we propose a novel adversarial attack algorithm that operates in the frequency domain of images and employs a learnable text context within a max-min competitive optimization framework, enabling the generation of adversarial perturbations that transfer across diverse prompts. Evaluation on three Med-VLMs and four Med-VQA datasets shows that our approach outperforms the baseline, achieving an average attack success rate of \(67\%\) (compared to the baseline's \(62\%\)).
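The abstract only names the ingredients (a frequency-domain image perturbation and a learnable text context trained competitively), so the following is a minimal sketch of that general idea rather than the paper's actual algorithm. It assumes a hypothetical `vlm_loss` that stands in for a Med-VLM's VQA loss, and all names, step sizes, and the simultaneous-update scheme are illustrative assumptions.

```python
import torch

# Hypothetical stand-in for a Med-VLM's VQA loss. A real implementation would
# embed the question, splice in the learnable context tokens, run the model,
# and return e.g. cross-entropy against the reference answer. This toy
# surrogate just lets the sketch run end-to-end.
def vlm_loss(image: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
    feats = image.mean(dim=(-2, -1)).flatten()      # crude image features
    return (feats[: context.numel()] * context.tanh()).sum()

def cross_prompt_attack(image, steps=40, eps=8 / 255,
                        lr_delta=5e-3, lr_ctx=1e-2, ctx_dim=3):
    """Max-min sketch: a frequency-domain perturbation ascends the loss while
    a learnable text context descends it, so the perturbation that survives
    should not depend on any one prompt. One common approximation of the
    max-min game (simultaneous gradient updates) is used here."""
    spec = torch.fft.rfft2(image)                         # image spectrum
    d_re = torch.zeros_like(spec.real, requires_grad=True)
    d_im = torch.zeros_like(spec.imag, requires_grad=True)
    context = torch.zeros(ctx_dim, requires_grad=True)    # learnable context

    for _ in range(steps):
        delta = torch.complex(d_re, d_im)                 # freq. perturbation
        adv = torch.fft.irfft2(spec + delta, s=image.shape[-2:])
        adv = image + (adv - image).clamp(-eps, eps)      # l_inf pixel budget
        adv = adv.clamp(0, 1)                             # valid image range
        loss = vlm_loss(adv, context)

        g_re, g_im, g_ctx = torch.autograd.grad(loss, (d_re, d_im, context))
        with torch.no_grad():
            d_re += lr_delta * g_re.sign()                # maximize over delta
            d_im += lr_delta * g_im.sign()
            context -= lr_ctx * g_ctx                     # minimize over context

    with torch.no_grad():
        delta = torch.complex(d_re, d_im)
        adv = torch.fft.irfft2(spec + delta, s=image.shape[-2:])
        return (image + (adv - image).clamp(-eps, eps)).clamp(0, 1)

# Usage sketch: x_adv = cross_prompt_attack(torch.rand(1, 3, 224, 224))
```

The intuition behind the competitive formulation is that by letting the text context adapt against the perturbation during optimization, the final perturbation cannot exploit a single fixed prompt and is therefore more likely to transfer across prompts.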