Boosting the Transferability of Adversarial Attacks with Reverse Adversarial Perturbation

Published: 28 Jan 2022, Last Modified: 22 Oct 2023
Venue: ICLR 2022 Submitted
Readers: Everyone
Keywords: Adversarial Examples, Black-Box Attacks, Adversarial Transferability
Abstract: Deep neural networks (DNNs) have been shown to be vulnerable to adversarial examples, which can induce erroneous predictions through imperceptible perturbations. In this work, we study the transferability of adversarial examples, which is of significant importance due to its threat to real-world applications where the model architecture or parameters are usually unknown. Many existing works reveal that adversarial examples are likely to overfit the surrogate model they are generated from, limiting their transfer attack performance against different target models. Inspired by the connection between the flatness of the loss landscape and model generalization, we propose a novel attack method, dubbed reverse adversarial perturbation (RAP), to boost the transferability of adversarial examples. Specifically, instead of purely minimizing the adversarial loss at a single adversarial point, we advocate seeking adversarial examples located in a low-value and flat region of the loss landscape, by injecting the worst-case perturbation, the reverse adversarial perturbation, at each step of the optimization procedure. The adversarial attack with RAP is formulated as a min-max bi-level optimization problem. Comprehensive experimental comparisons demonstrate that RAP can significantly boost adversarial transferability. Furthermore, RAP can be naturally combined with many existing black-box attack techniques to further boost transferability. When attacking a real-world image recognition system, the Google Cloud Vision API, we obtain a 22% performance improvement for targeted attacks over the compared method.
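The min-max procedure described in the abstract can be sketched as follows. This is a minimal, illustrative PyTorch sketch reconstructed only from the abstract: at each outer step, an inner loop first searches for the worst-case "reverse" perturbation around the current adversarial example, and the outer step then updates the adversarial example so the loss stays low at that shifted point. The function name `rap_attack`, the hyperparameter values, and the use of cross-entropy as the adversarial loss are assumptions, not the authors' reference implementation (see the linked code below).

```python
import torch
import torch.nn.functional as F

def rap_attack(model, x, y_target, eps=16/255, alpha=2/255,
               eps_n=16/255, alpha_n=2/255,
               outer_steps=400, inner_steps=8):
    """Targeted transfer attack with a reverse adversarial perturbation (sketch)."""
    x_adv = x.clone().detach()
    for _ in range(outer_steps):
        # Inner maximization: find the worst-case (reverse) perturbation n that
        # maximizes the adversarial loss around the current adversarial example.
        n = torch.zeros_like(x_adv)
        for _ in range(inner_steps):
            n.requires_grad_(True)
            loss = F.cross_entropy(model(x_adv + n), y_target)
            grad = torch.autograd.grad(loss, n)[0]
            # Ascend on the adversarial loss within an L-inf ball of radius eps_n.
            n = (n + alpha_n * grad.sign()).clamp(-eps_n, eps_n).detach()
        # Outer minimization: update x_adv so the adversarial loss stays low even
        # at the worst-case shifted point x_adv + n (low-value, flat region).
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv + n), y_target)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = (x_adv - alpha * grad.sign()).detach()
        # Project back into the L-inf ball around the clean input and valid pixel range.
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv
```

Setting `inner_steps=0` (or skipping the inner loop) recovers a plain iterative targeted attack; the inner maximization is what pushes the solution toward a flat region of the loss landscape.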
Community Implementations: 1 code implementation (https://www.catalyzex.com/paper/arxiv:2210.05968/code)
