Keywords: Data augmentation, Transferable targeted attacks
Abstract: Diverse input patterns induced by data augmentations prevent crafted adversarial perturbations from over-fitting to white-box models, hence improving the transferability of adversarial examples for non-targeted attacks. Nevertheless, current data augmentation methods usually perform unsatisfactorily for transferable targeted attacks. In this paper, we revisit the commonly used data augmentation method DI (Diverse Input), which was originally proposed to improve non-targeted transferability, and discover that its unsatisfactory performance in targeted transferability is mainly caused by its unreasonably restricted input diversity. We further show that directly increasing the diversity of input patterns yields better transferability. In addition, our analysis of attention heatmaps suggests that incorporating more diverse input patterns into the perturbation optimization enlarges the discriminative regions of the target class in the white-box model; the generated perturbations can therefore activate the discriminative regions of other models with high probability. Motivated by this observation, we propose to optimize perturbations over a set of augmented images whose discriminative regions of the target class vary widely in the white-box model. Specifically, we design a data augmentation method, comprising multiple image transformations that substantially change the discriminative regions of the target class, which improves transferable targeted attacks by a large margin. On the ImageNet-compatible dataset, our method achieves an average 92.5\% targeted attack success rate in the ensemble transfer scenario, shedding light on transfer-based targeted attacks.
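As background for the DI baseline the abstract revisits, below is a minimal sketch of a DI-style random resize-and-pad input transformation. The function name, size range, and probability are illustrative assumptions, not the authors' implementation; vanilla DI applies the transform with a fixed probability and a narrow resize range, which is the "restricted diversity" the abstract critiques.

```python
import numpy as np

def di_transform(x, out_size=330, prob=0.7, rng=None):
    """DI-style augmentation sketch: with probability `prob`, randomly
    resize the HxWxC image `x` and zero-pad it to `out_size`.
    All defaults are illustrative, not the paper's settings."""
    rng = rng or np.random.default_rng()
    if rng.random() > prob:
        return x  # keep the original input untouched
    h = x.shape[0]
    # draw a random target size; a wider range means more diverse inputs
    new_size = int(rng.integers(h // 2, out_size))
    # nearest-neighbour resize via index sampling (keeps the sketch dependency-free)
    idx = (np.arange(new_size) * h / new_size).astype(int)
    resized = x[idx][:, idx]
    # pad the resized image at a random position inside the output canvas
    top = int(rng.integers(0, out_size - new_size + 1))
    left = int(rng.integers(0, out_size - new_size + 1))
    padded = np.zeros((out_size, out_size, x.shape[2]), dtype=x.dtype)
    padded[top:top + new_size, left:left + new_size] = resized
    return padded
```

In a transfer attack, such a transform would be applied to the adversarial image at every optimization step before the white-box forward pass, so the perturbation cannot over-fit to a single input geometry.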
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Social Aspects of Machine Learning (eg, AI safety, fairness, privacy, interpretability, human-AI interaction, ethics)
Supplementary Material: zip