Enhancing Transferability of Targeted Adversarial Examples via Inverse Target Gradient Competition and Spatial Distance Stretching
Abstract: In the field of AI security, deep neural networks (DNNs) are highly sensitive to adversarial examples (AEs), which can cause incorrect predictions with only minimal input perturbations. Although AEs exhibit transferability across models, targeted attack success rates (TASRs) remain low due to differences in feature dimensions and decision boundaries between models. To enhance the transferability of targeted AEs, we propose a novel approach that applies Inverse Target Gradient Competition (ITC) and Spatial Distance Stretching (SDS) during optimization. Specifically, we employ a Siamese-network-like framework to generate both non-targeted and targeted AEs. The ITC mechanism injects non-targeted adversarial gradients at each epoch to impede the optimization of the targeted perturbations, thereby making them more robust. Additionally, a top-$k$ SDS strategy guides AEs to penetrate target-class regions in the latent space while moving away from non-targeted regions, achieving optimal transferability. Compared with state-of-the-art competition-based attacks, our method significantly improves transferable TASRs, by 16.1% on mainstream CNNs and 21.4% on ViTs, respectively, and demonstrates superior capability in breaking defenses.
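To make the interplay of the two mechanisms concrete, below is a minimal PyTorch-style sketch of one optimization step, assuming a single surrogate model, an iterative signed-gradient update, and a simple additive combination of the competing gradients. The function name `itc_sds_step`, the step size `eps_step`, the choice of `k`, and the exact way the non-targeted gradient and the top-$k$ stretching term are combined are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn.functional as F


def itc_sds_step(model, x_adv, x_nt_adv, y_true, y_target, eps_step=2 / 255, k=5):
    """One illustrative update step combining a targeted loss, a top-k
    distance-stretching term, and a competing non-targeted gradient.
    Names and the combination rule are assumptions for illustration."""
    # --- Targeted branch: pull the targeted AE toward the target class ---
    x_adv = x_adv.clone().detach().requires_grad_(True)
    logits = model(x_adv)
    loss_target = F.cross_entropy(logits, y_target)

    # --- Top-k "stretching": suppress the k strongest non-target classes ---
    target_mask = F.one_hot(y_target, logits.size(1)).bool()
    non_target_logits = logits.masked_fill(target_mask, float("-inf"))
    loss_stretch = non_target_logits.topk(k, dim=1).values.mean()

    (loss_target + loss_stretch).backward()
    grad_targeted = x_adv.grad.detach()

    # --- Non-targeted branch: gradient of the sibling AE w.r.t. the true class ---
    x_nt = x_nt_adv.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x_nt), y_true).backward()
    grad_non_targeted = x_nt.grad.detach()

    # --- Competition: the non-targeted gradient perturbs the targeted update ---
    combined = grad_targeted - grad_non_targeted  # assumed combination rule
    x_next = (x_adv.detach() - eps_step * combined.sign()).clamp(0.0, 1.0)
    return x_next
```

In the full method, a step like this would presumably sit inside the epoch loop of the Siamese-style framework, with the non-targeted and targeted AEs each updated per iteration; the sketch only illustrates how a competing gradient can be injected before the signed descent step.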