Improving the Transferability of Adversarial Examples by Feature Augmentation

Published: 2025, Last Modified: 08 Jan 2026, IEEE Trans. Neural Networks Learn. Syst. 2025, CC BY-SA 4.0
Abstract: Adversarial transferability is a significant property of adversarial examples, which enables an adversarial example to attack unknown models. However, models with different architectures trained on the same task concentrate on different information, which weakens adversarial transferability. To enhance adversarial transferability, input transformation-based attacks apply random transformations to the input to find adversarial examples that resist such transformations, but these methods ignore model discrepancy; ensemble attacks fuse multiple models to shrink the search space so that the found adversarial examples work on all of them, but ensemble attacks are resource-intensive. In this article, we propose a simple but effective feature augmentation attack (FAUG) method to improve adversarial transferability. We dynamically add random noise to the intermediate features of the target model during the generation of adversarial examples, thereby avoiding overfitting to the target model. Specifically, we first explore the noise tolerance of the model and disclose the discrepancy across layers and noise strengths. Then, based on that analysis, we devise a dynamic random noise generation method, which determines the noise strength according to the features produced in the mini-batch. Finally, we apply a gradient-based attack algorithm to the feature-augmented model, resulting in better adversarial transferability without introducing extra computation cost. Extensive experiments conducted on the ImageNet dataset across CNN and Transformer models corroborate the efficacy of our method, e.g., we achieve improvements of +30.67% and +5.57% over input transformation-based attacks and combination methods, respectively.
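To make the idea concrete, below is a minimal sketch (not the authors' released code) of feature augmentation during adversarial example generation. Assumptions: a PyTorch ResNet-50 surrogate, noise injected through a forward hook on one intermediate layer (layer2 chosen arbitrarily), an I-FGSM-style attack loop, and a noise-strength heuristic that scales Gaussian noise by the standard deviation of the mini-batch features (the 0.1 factor is illustrative, not the paper's value).

```python
import torch
import torchvision.models as models

# Surrogate (target) model used to craft transferable adversarial examples.
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()

def feature_noise_hook(module, inputs, output):
    # Dynamic noise strength: scale Gaussian noise by the std of the features
    # produced for the current mini-batch (hypothetical heuristic).
    sigma = 0.1 * output.detach().std()
    return output + sigma * torch.randn_like(output)

# Augment intermediate features of one layer; layer choice is an assumption.
hook = model.layer2.register_forward_hook(feature_noise_hook)

def faug_ifgsm(x, y, eps=16 / 255, alpha=2 / 255, steps=10):
    """I-FGSM on the feature-augmented surrogate (sketch)."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = torch.nn.functional.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Ascend the loss, then project back into the epsilon ball and [0, 1].
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.max(torch.min(x_adv, x + eps), x - eps).clamp(0.0, 1.0)
    return x_adv

# Example usage with a dummy batch; remove the hook when done.
x = torch.rand(4, 3, 224, 224)
y = torch.randint(0, 1000, (4,))
x_adv = faug_ifgsm(x, y)
hook.remove()
```

Because the noise is re-sampled at every forward pass, each attack iteration sees a slightly different feature landscape, which is what discourages the adversarial example from overfitting to the surrogate model.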