Typography Leads Semantic Diversifying: Amplifying Adversarial Transferability across Multimodal Large Language Models

Hao Cheng; Erjia Xiao; Jiayan Yang; Jiahang Cao; Qiang Zhang; Le Yang; Jize Zhang; Kaidi Xu; Jindong Gu; Renjing Xu

24 Sept 2024 (modified: 14 Dec 2024) | ICLR 2025 Conference Withdrawn Submission | CC BY 4.0

Keywords: Adversarial Transferability; Multimodal Large Language Models; Data Augmentation

Abstract: Multimodal Large Language Models (MLLMs) have recently demonstrated exceptional zero-shot performance through their advanced cross-modal interaction and comprehension abilities. Despite these capabilities, MLLMs remain vulnerable to human-imperceptible adversarial examples. In real-world scenarios, the most significant threat posed by adversarial examples is considered to be their transferability, which lets an example crafted against one model affect other models. However, systematic research on adversarial transferability across MLLMs is currently lacking. This paper therefore takes a first step toward a comprehensive evaluation of the transferability of adversarial examples generated by various MLLMs. We further leverage two factors that significantly affect transferability: 1) the degree of information diversity involved in adversarial generation; 2) the integration of editing across the vision and language modalities. Building on these factors, we propose a boosting method, the Typography Augment Transferability Method (TATM), to explore adversarial transferability across MLLMs. In extensive experiments, TATM demonstrates exceptional performance in two real-world applications: Harmful Word Insertion and Important Information Protection.

Supplementary Material: zip

Primary Area: applications to computer vision, audio, language, and other modalities

Submission Number: 3659
