Abstract: In the image, video, and even physical-world domains, adversarial examples can mislead deep models into producing wrong predictions. Transfer-based attacks against black-box models better match realistic scenarios, but adversarial examples crafted on a surrogate model overfit that source model and therefore have a low success rate when transferred to the target model. We study the Stochastic Weight Averaging strategy used in domain generalization and propose a Stochastic Perturbation Averaging method (SPA). Specifically, we add stochastic perturbations to the examples during the gradient descent attack and design a Central Amplification method (CAM) to enhance this random variation. SPA then stabilizes the iteration direction by averaging the gradients of the perturbed examples, which steers the attack toward a relatively flat local minimum of the loss function. SPA is an efficient and general strategy that can significantly improve the transferability of gradient-based attack methods. For instance, adversarial examples crafted on four single source models achieve an average attack success rate of 90.10% against seven pre-trained models, which is the best result so far. Code is available at https://github.com/yangrongbo/SPA.
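The averaging step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy linear model with a logistic loss, uniform random noise, a sign-gradient update, and an L-infinity budget, and the names `spa_attack`, `noise_scale`, and `n_samples` are hypothetical. CAM is left out because its details are not given in this abstract.

```python
import math
import random


def loss(x, w, y):
    """Logistic loss log(1 + exp(-y * w.x)) of a toy linear model (assumed)."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    return math.log1p(math.exp(-margin))


def loss_grad(x, w, y):
    """Gradient of the logistic loss with respect to the input x."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    coeff = -y / (1.0 + math.exp(margin))
    return [coeff * wi for wi in w]


def sign(v):
    return (v > 0) - (v < 0)


def spa_attack(x, w, y, eps=0.3, steps=10, n_samples=20,
               noise_scale=0.1, seed=0):
    """Iterative sign-gradient attack that averages gradients over
    randomly perturbed copies of the current adversarial example."""
    rng = random.Random(seed)
    alpha = eps / steps          # per-step size (assumed schedule)
    x_adv = list(x)
    for _ in range(steps):
        avg = [0.0] * len(x)
        for _ in range(n_samples):
            # Stochastic perturbation of the current example.
            noisy = [xi + rng.uniform(-noise_scale, noise_scale)
                     for xi in x_adv]
            g = loss_grad(noisy, w, y)
            avg = [a + gi / n_samples for a, gi in zip(avg, g)]
        # Move along the sign of the averaged gradient to increase the loss.
        x_adv = [xi + alpha * sign(a) for xi, a in zip(x_adv, avg)]
        # Project back into the L-infinity ball of radius eps around x.
        x_adv = [min(max(xa, x0 - eps), x0 + eps)
                 for xa, x0 in zip(x_adv, x)]
    return x_adv


if __name__ == "__main__":
    x0 = [0.5, 0.2, -0.1]
    w0 = [1.0, -2.0, 0.5]
    adv = spa_attack(x0, w0, y=1)
    print("clean loss:", loss(x0, w0, 1))
    print("adversarial loss:", loss(adv, w0, 1))
```

On a linear model the averaged gradient points the same way as the gradient at any single point, so the sketch only shows the mechanics. The averaging is meant to matter on non-linear deep models, where gradients at nearby points can disagree.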