CroPA++: Exposing Vulnerabilities in Vision Language Models and Enhancing Adversarial Transferability of Cross-Prompt Attacks
Keywords: Adversarial ML, Vision Language Model, Adversarial Transferability
TL;DR: This paper introduces CroPA++, a set of three complementary enhancements to CroPA, the prevalent method for cross-prompt attacks on VLMs, improving transferability across images and models and yielding higher ASR in cross-prompt scenarios.
Abstract: Vision-Language Models (VLMs) enable image classification, captioning, and visual question answering, but remain vulnerable to adversarial perturbations, especially when both visual and textual inputs can be manipulated. Cross-prompt attacks, a novel paradigm of adversarial attacks on VLMs, show that image perturbations can retain adversarial impact under diverse prompts, yet their practical reliability is limited by sensitivity to initialization, poor cross-image generalization, and high compute cost relative to yield. We present three complementary enhancements: (1) Noise Initialization via semantically informed alignment, (2) Value-Vector Doubly-UAP Guidance that targets attention value vectors in the vision encoder, and (3) Cross-Image Universal Training using SCMix and CutMix. Evaluations on BLIP-2, InstructBLIP, LLaVA, and OpenFlamingo across VQA, captioning, and classification indicate consistent gains over prior methods in Attack Success Rate (ASR), stability, and transferability. Our code is available at https://anonymous.4open.science/r/CroPA-CD38
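To illustrate the third enhancement, the sketch below shows the CutMix image-mixing step that cross-image universal training could build on. This is a minimal sketch, not the paper's implementation: the `cutmix` helper and the surrounding training loop described in the comments are assumptions for illustration; only the use of CutMix-style mixing comes from the abstract.

```python
import numpy as np

def cutmix(img_a, img_b, alpha=1.0, rng=None):
    """Paste a random rectangular patch of img_b onto img_a (CutMix-style).

    Images are H x W x C arrays; the kept-area fraction lam is drawn
    from Beta(alpha, alpha), and the patch covers roughly (1 - lam)
    of the image area. (Hypothetical helper, not the paper's code.)
    """
    rng = rng or np.random.default_rng()
    h, w = img_a.shape[:2]
    lam = rng.beta(alpha, alpha)              # fraction of img_a kept
    cut_h = int(h * np.sqrt(1 - lam))         # patch height
    cut_w = int(w * np.sqrt(1 - lam))         # patch width
    cy, cx = rng.integers(0, h), rng.integers(0, w)  # patch center
    y1, y2 = np.clip(cy - cut_h // 2, 0, h), np.clip(cy + cut_h // 2, 0, h)
    x1, x2 = np.clip(cx - cut_w // 2, 0, w), np.clip(cx + cut_w // 2, 0, w)
    mixed = img_a.copy()
    mixed[y1:y2, x1:x2] = img_b[y1:y2, x1:x2]
    return mixed

# A cross-image universal perturbation loop would then (hypothetically):
# keep one shared perturbation delta, and at each step mix two training
# images with cutmix, add delta, evaluate the attack loss on the VLM,
# and take a signed-gradient step on delta projected to an L-inf ball.
```

Mixing image pairs before each perturbation update exposes the shared perturbation to composite content, which is one plausible way such training encourages transfer across images rather than overfitting to a single one.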
Submission Number: 207