Revisiting CroPA: A Reproducibility Study and Enhancements for Cross-Prompt Adversarial Transferability in Vision-Language Models
Abstract: Large Vision-Language Models (VLMs) have revolutionized computer vision, enabling tasks such as image classification, captioning, and visual question answering. However, they remain highly vulnerable to adversarial attacks, particularly in scenarios where both visual and textual modalities can be manipulated. In this study, we conduct a comprehensive reproducibility study of "An Image is Worth 1000 Lies: Adversarial Transferability Across Prompts on Vision-Language Models", validating the Cross-Prompt Attack (CroPA) and confirming its superior cross-prompt transferability compared to existing baselines. Beyond replication, we propose several key improvements: (1) a novel initialization strategy that significantly improves Attack Success Rate (ASR); (2) an investigation of cross-image transferability through learned universal perturbations; and (3) a novel loss function that targets the attention mechanisms of the vision encoder to improve generalization. Our evaluation across prominent VLMs, including Flamingo, BLIP-2, and InstructBLIP, validates the original results and demonstrates that our improvements consistently boost adversarial effectiveness. Our work reinforces the importance of studying adversarial vulnerabilities in VLMs and provides a more robust framework for generating transferable adversarial examples, with significant implications for understanding the security of VLMs in real-world applications.
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: Since the last submission, we have updated the results for both the reproducibility study and our proposed extensions. In the original submission, computational constraints prevented us from completing all experiments; however, we have now successfully addressed these limitations and present a more comprehensive study. Specifically:
(1) We have completed and reported results for all major claims of the original paper, validating the effectiveness of the Cross-Prompt Attack (CroPA) across multiple vision-language tasks.
(2) We have reported and analyzed the results for our extensions and drawn conclusions from them.
(3) We have added a Broader Impact statement and a comprehensive appendix to ensure a more complete and well-supported study.
(4) We have made minor revisions to the writing of the Introduction and Section 4, strengthening the presentation of both the reproducibility findings and our novel contributions.
Assigned Action Editor: ~Dit-Yan_Yeung2
Submission Number: 4323