Keywords: Universal Adversarial Perturbations, Ordered Top-K AttacKs, AllAttacK
TL;DR: Revisiting and expanding universal adversarial perturbations jointly along model, data and target dimensions
Abstract: Universal adversarial perturbations (UAPs) have deepened concerns about the vulnerability of Deep Neural Networks (DNNs) after the initial intriguing discovery of vanilla single-model-single-image adversarial attacks. However, the landscape of UAPs has not been thoroughly investigated. In this paper, we revisit and expand UAPs for white-box targeted attacks along three axes simultaneously: the model-axis, the data-axis, and the target-axis. For the target-axis, we adopt the most aggressive ordered top-$K$ attack protocol ($K\geq 1$), expanding the traditional top-$1$ attack setting used in the prior art of learning UAPs. Our proposed method is thus dubbed AllAttacK.
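To make the target-axis protocol concrete, the following is a minimal sketch of the ordered top-$K$ success criterion (an illustration, not released code): an attack succeeds only if the $K$ highest-scoring classes, read in decreasing order of confidence, exactly match a specified ordered target list. The function name and signature are assumptions for exposition.

    # Minimal sketch (illustrative assumption, not the paper's code) of the
    # ordered top-K success criterion: the attack succeeds only when the K
    # highest-scoring classes, in decreasing confidence, equal the ordered targets.
    import torch

    def ordered_topk_success(logits: torch.Tensor, targets: list) -> bool:
        # logits: (num_classes,) scores for one perturbed image
        # targets: ordered list of K target class indices (hypothetical input)
        k = len(targets)
        topk = torch.topk(logits, k).indices.tolist()  # sorted by descending score
        return topk == list(targets)  # order matters, unlike a set-based top-K match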
In implementation, AllAttacK is built on two state-of-the-art single-model-single-image ordered top-$K$ attack methods: the KL-divergence-based adversarial distillation method and the more recently proposed quadratic-programming-based method. We propose a simple yet effective joint mini-data-batch and mini-model-batch optimization strategy for learning UAPs over a large number of models (e.g., up to 18 disparate DNNs) and a large number of images (e.g., 1000 images). We test AllAttacK on the ImageNet-1k classification task using an ensemble of disparate models, including Convolutional Neural Networks and their adversarially robustified versions, Vision Transformers, CLIP vision encoders, and MLP-Mixers. The learned AllAttacK perturbations are doubly transferable, across training and testing models and across training and testing images, and they also exhibit intriguing yet visually sensible patterns.
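Below is a minimal sketch (assumptions for illustration, not the released implementation) of the joint mini-data-batch and mini-model-batch strategy: a single universal perturbation is updated with gradients accumulated over a random subset of models and a random mini-batch of images at each step. The KL-based surrogate loss is a simplified stand-in for the adversarial distillation objective; all names, default hyperparameters, and the $L_\infty$ budget are assumptions.

    # Sketch of joint mini-data-batch / mini-model-batch UAP learning
    # (illustrative assumptions, not the authors' code).
    import random
    import torch
    import torch.nn.functional as F

    def adversarial_distillation_loss(logits, targets, num_classes, temp=1.0):
        # Simplified KL-based stand-in for the adversarial distillation objective:
        # push the softmax output toward a target distribution that places
        # monotonically decreasing mass on the K ordered target classes.
        k = len(targets)
        p = torch.zeros(num_classes)
        p[list(targets)] = torch.linspace(float(k), 1.0, k)  # more mass for higher rank
        p = p / p.sum()
        log_q = F.log_softmax(logits / temp, dim=-1)
        return F.kl_div(log_q, p.expand_as(log_q), reduction="batchmean")

    def learn_uap(models, images, targets, num_classes=1000, eps=10 / 255,
                  steps=1000, model_batch=4, data_batch=32, lr=0.01):
        # images: list of (C, H, W) tensors in [0, 1]; models: list of frozen DNNs.
        delta = torch.zeros_like(images[0], requires_grad=True)  # one shared UAP
        opt = torch.optim.Adam([delta], lr=lr)
        for _ in range(steps):
            fs = random.sample(models, k=min(model_batch, len(models)))   # mini-model-batch
            idx = random.sample(range(len(images)), k=min(data_batch, len(images)))  # mini-data-batch
            x = torch.stack([images[i] for i in idx])
            loss = sum(adversarial_distillation_loss(f((x + delta).clamp(0, 1)),
                                                     targets, num_classes) for f in fs)
            opt.zero_grad()
            loss.backward()
            opt.step()
            with torch.no_grad():
                delta.clamp_(-eps, eps)  # keep the perturbation within an L_inf budget
        return delta.detach()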
Supplementary Material: zip
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 8147