A Unified Approach to Interpreting and Boosting Adversarial Transferability

Xin Wang; Jie Ren; Shuyun Lin; Xiangming Zhu; Yisen Wang; Quanshi Zhang

Published: 12 Jan 2021; Last Modified: 05 May 2023; ICLR 2021 Poster; Readers: Everyone

Keywords: Adversarial Learning, Interpretability, Adversarial Transferability

Abstract: In this paper, we use the interaction inside adversarial perturbations to explain and boost adversarial transferability. We discover and prove a negative correlation between adversarial transferability and the interaction inside adversarial perturbations. This negative correlation is further verified on different DNNs with various inputs. Moreover, it offers a unified perspective for understanding current transferability-boosting methods: we prove that some classic methods for enhancing transferability essentially decrease interactions inside adversarial perturbations. Based on this, we propose to directly penalize interactions during the attacking process, which significantly improves adversarial transferability. We will release the code when the paper is accepted.

One-sentence Summary: We prove a close relationship between interactions and adversarial transferability, provide a unified explanation for previous transferability-boosting methods, and develop a loss to improve adversarial transferability.
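This page does not define "interaction", so the following is only a minimal sketch under stated assumptions. It treats the interaction between two perturbation units i and j as a Shapley-style quantity: the average, over sampled contexts S of other units, of v(S ∪ {i, j}) - v(S ∪ {i}) - v(S ∪ {j}) + v(S). The function name `pairwise_interaction`, the toy value functions `additive` and `coupled`, and the sampling scheme are all hypothetical illustrations, not the authors' implementation. In an attack setting, v(S) would presumably measure the effect on the model's loss when only the perturbation units in S are applied; here v is a toy function so the sketch runs without a model.

```python
import random


def pairwise_interaction(v, n, i, j, num_samples=500, seed=0):
    """Monte Carlo estimate of a Shapley-style interaction between units i and j.

    v: callable mapping a frozenset of unit indices to a float (assumed value function).
    n: total number of units.
    """
    rng = random.Random(seed)
    others = [k for k in range(n) if k != i and k != j]
    total = 0.0
    for _ in range(num_samples):
        # Pick a context size uniformly, then a random context of that size.
        size = rng.randint(0, len(others))
        context = frozenset(rng.sample(others, size))
        # Joint contribution of {i, j} minus the two individual contributions.
        total += (v(context | {i, j}) - v(context | {i})
                  - v(context | {j}) + v(context))
    return total / num_samples


def additive(units):
    # Toy value function: every unit contributes independently.
    return float(sum(k + 1 for k in units))


def coupled(units):
    # Toy value function: units 0 and 1 earn a bonus only when both are present.
    return additive(units) + (1.0 if 0 in units and 1 in units else 0.0)


if __name__ == "__main__":
    print(pairwise_interaction(additive, 6, 0, 1))
    print(pairwise_interaction(coupled, 6, 0, 1))
```

With the additive toy function, the four terms cancel for every context, so the estimate is zero; with the coupled one, only the pair (0, 1) has a nonzero interaction. A penalty of the kind the abstract describes would, under this assumption, add a term of this form to the attack objective so that the optimizer favors perturbations whose units interact less.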

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
