Adversarial Attacks and Defenses in Vision-Language Pre-training: Techniques, Challenges and Opportunities
Abstract: Vision-language pre-training (VLP) has emerged as a powerful paradigm for multimodal learning. Despite their superior capabilities, however, VLP models remain vulnerable to adversarial attacks that manipulate their inputs. By undermining user trust, such attacks can significantly compromise model integrity and introduce critical security vulnerabilities, underscoring the importance of securing VLP models for safe deployment in real-world multimodal applications. This review examines the adversarial landscape of VLP models, delving into the methodologies and implications of both adversarial attacks and defense strategies, organized by architectural considerations. We categorize adversarial attack strategies and underscore the critical need for robust defensive measures, and we discuss novel defense mechanisms that counter these vulnerabilities and improve model reliability. In addition, we analyze how adversarial vulnerabilities impact downstream applications. Overall, this review aims to provide a comprehensive overview of adversarial threats in VLP and to present future research directions.
Submission Type: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Yang_Zhang15
Submission Number: 8534