CASh: Causality Alignment Shifting to Unveil Vulnerabilities in Vision-Language Models

ICLR 2026 Conference Submission 15015 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Causality, VLMs, cross-attention matrix
Abstract: Existing adversarial attacks on vision-language models (VLMs) primarily use joint occurrence likelihoods to capture interdependency, often missing the true relationship between the text and the image. This paper presents CASh, a novel attack on VLMs that manipulates latent causal representations between images and text in pre-trained models. We leverage the cross-attention matrix to capture causality alignment and exploit its singular properties to develop an efficient perturbation algorithm that disrupts VLM tasks. Our attack targets core causal relationships that exist independently of specific VLMs, ensuring transferability across models. Unlike existing attacks, which primarily perturb inputs using correlation-based patterns, our approach accounts for causality, offering interpretability by showing how causal shifts lead to changes in VLM behavior. We evaluate CASh across various VLMs and compare it to existing attack methods. Our results demonstrate a significant performance boost, with an average improvement of 20.88% in transferable attack capability.
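The abstract outlines the method only at a high level: compute a text-image cross-attention matrix, then exploit its singular structure to drive an efficient input perturbation. The sketch below illustrates one plausible reading of that recipe as a PGD-style attack that suppresses the dominant singular component of the cross-attention matrix. It is not the authors' algorithm; the encoder interface (`model.encode`), the choice of objective (`-S[0]`), and all hyperparameters are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def cash_style_perturbation(model, image, text_ids,
                            eps=8 / 255, alpha=2 / 255, steps=10):
    """Hypothetical sketch: PGD perturbation that shifts the leading
    singular direction of a text-image cross-attention matrix.
    `model.encode` is an assumed API returning text and image token
    embeddings of shape [T, d] and [V, d]."""
    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(steps):
        t_emb, v_emb = model.encode(text_ids, image + delta)
        # Scaled-dot-product cross-attention between text and image tokens.
        attn = F.softmax(t_emb @ v_emb.T / t_emb.shape[-1] ** 0.5, dim=-1)
        # Singular structure of the alignment matrix (differentiable in PyTorch).
        U, S, Vh = torch.linalg.svd(attn)
        # Assumed objective: suppress the top singular value, which by
        # hypothesis carries the dominant text-image alignment.
        loss = -S[0]
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()   # ascend on the loss
            delta.clamp_(-eps, eps)              # stay in the L-inf ball
        delta.grad = None
    return (image + delta).clamp(0, 1).detach()
```

Under this reading, transferability would follow from the objective acting on the alignment matrix itself rather than on any one model's logits, though the paper's actual formulation of "causality alignment shifting" may differ.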
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 15015