Abstract: Deep neural networks (DNNs) have received considerable attention because of their impressive progress in computer vision. However, it has recently been shown that DNNs are vulnerable to carefully crafted adversarial samples. These samples are generated by specific attack algorithms that can fool the target model while remaining imperceptible to humans. Recently, feature-level attacks have become a focus of research due to their high transferability. Existing state-of-the-art feature-level attacks all improve transferability by greedily changing the attention of the model. However, for images that contain multiple target-class objects, the attention of different models may differ significantly. Greedily changing attention may therefore cause the adversarial samples corresponding to these images to fall into a local optimum of the surrogate model. Furthermore, due to the large structural differences between vision transformers (ViTs) and convolutional neural networks (CNNs), adversarial samples generated on CNNs with feature-level attacks are less likely to successfully attack ViTs. To overcome these drawbacks, we propose the Critical Region-oriented Feature-level Attack (CRFA) in this paper. Specifically, we first propose Perturbation Attention-aware Weighting (PAW), which destroys critical regions of the image by performing feature-level attention weighting on the adversarial perturbations while changing the model's attention as little as possible. We then propose Region ViT-critical Retrieval (RVR), which enables the generator to improve the transferability of adversarial samples to ViTs by adding extra ViT prior knowledge to the decoder. Extensive experiments demonstrate significant performance improvements achieved by our approach, i.e., improving the fooling rate by 19.9% against CNNs and 25.0% against ViTs compared to the state-of-the-art feature-level attack method.