Stochastic Perturbation Averaging Boosts Transferability of Adversarial Examples

Published: 01 Jan 2023, Last Modified: 12 Apr 2025, DSAA 2023, CC BY-SA 4.0
Abstract: In the image, video, and even physical domains, adversarial examples can mislead deep models into producing wrong predictions. Transfer-based attacks against black-box models are closer to realistic scenarios, but adversarial examples crafted on a surrogate model have a low success rate when transferred to the target model because they overfit the source model. We study the Stochastic Weight Averaging strategy used in domain generalization and propose a Stochastic Perturbation Averaging (SPA) method. Specifically, we add stochastic perturbations to the examples during the gradient-based attack, and we design a Central Amplification Method (CAM) to enhance this random variation; SPA then stabilizes the iteration direction by averaging the gradients of the perturbed examples to find a relatively flat local minimum of the loss function. SPA is an efficient and general strategy that can significantly improve the transferability of gradient-based attack methods. For instance, the average attack success rate of adversarial examples crafted on four single models against seven pre-trained models reaches 90.10%, the best result so far. Code is available at https://github.com/yangrongbo/SPA.
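The sketch below illustrates the gradient-averaging idea the abstract describes: at each attack step, several stochastically perturbed copies of the current adversarial example are formed, their gradients are averaged, and the averaged gradient drives a momentum-style update. The number of copies, noise scale, and the simple amplification factor standing in for CAM are illustrative assumptions, not values taken from the paper.

```python
# Minimal sketch of SPA-style gradient averaging inside an MI-FGSM-like loop.
# n_copies, beta, amp (a crude stand-in for Central Amplification), and decay
# are assumed hyperparameters for illustration only.
import torch
import torch.nn.functional as F

def spa_attack(model, x, y, eps=16/255, steps=10, n_copies=20,
               beta=2.0, amp=1.5, decay=1.0):
    alpha = eps / steps                      # per-step size
    adv = x.clone().detach()
    momentum = torch.zeros_like(x)

    for _ in range(steps):
        grad_sum = torch.zeros_like(x)
        for _ in range(n_copies):
            # Stochastic perturbation around the current adversarial example;
            # 'amp' amplifies the random variation (assumed CAM stand-in).
            noise = torch.empty_like(x).uniform_(-beta * eps, beta * eps)
            x_near = (adv + amp * noise).clamp(0, 1).requires_grad_(True)
            loss = F.cross_entropy(model(x_near), y)
            grad_sum += torch.autograd.grad(loss, x_near)[0]

        # Averaging gradients over the perturbed copies stabilizes the
        # iteration direction before the momentum update.
        g = grad_sum / n_copies
        momentum = decay * momentum + g / g.abs().mean(dim=(1, 2, 3), keepdim=True)
        adv = (adv + alpha * momentum.sign()).detach()
        # Project back into the eps-ball around the clean input.
        adv = torch.min(torch.max(adv, x - eps), x + eps).clamp(0, 1)

    return adv
```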