Abstract: Deep neural networks (DNNs) are known to be vulnerable to adversarial examples (AEs). Transfer-based attacks enable attackers to craft adversarial images on local surrogate models without any feedback from remote target models. One promising line of attack distracts the attention map of the surrogate model, which is likely to be shared among remote models. However, we find that attention maps computed from a local model usually over-focus on the most critical area, which limits the transferability of such attacks. In response to this challenge, we propose an enhanced image transformation method (EIT), which guides adversarial perturbations to distract not only the most critical area but also other relevant regions. EIT effectively mitigates the differences in attention maps across models and better neutralizes model-specific features, thereby avoiding local optima specific to the surrogate model. Experiments confirm the superiority of our approach over state-of-the-art benchmarks. Our implementation is available at: github.com/britney-code/EIT-attack.
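To make the attention-distraction idea in the abstract concrete, here is a minimal sketch, not the authors' EIT implementation: it assumes a Grad-CAM-style attention map on a ResNet-50 surrogate, a hypothetical top-k "distraction" loss that suppresses attention on several salient regions rather than only the single peak, and one PGD-style update. All names and hyperparameters below are illustrative assumptions.

```python
# Illustrative sketch only (assumptions): Grad-CAM-style attention,
# top-k distraction loss, and a single PGD-style step on a surrogate.
import torch
import torch.nn.functional as F
import torchvision.models as models

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).eval()

features = {}
def hook(_, __, output):
    features["maps"] = output
model.layer4.register_forward_hook(hook)  # last conv block as attention source

def attention_map(x, label):
    """Grad-CAM-style attention for the surrogate model (assumption)."""
    logits = model(x)
    score = logits.gather(1, label.view(-1, 1)).sum()
    # create_graph=True so the loss below stays differentiable w.r.t. x
    grads = torch.autograd.grad(score, features["maps"], create_graph=True)[0]
    weights = grads.mean(dim=(2, 3), keepdim=True)          # channel importance
    cam = F.relu((weights * features["maps"]).sum(dim=1))   # (B, H, W)
    return cam / (cam.amax(dim=(1, 2), keepdim=True) + 1e-8)

def distraction_loss(cam, k=16):
    """Penalize the k most-attended cells, not only the peak, so the
    perturbation also distracts other relevant regions (assumption)."""
    return cam.flatten(1).topk(k, dim=1).values.mean()

# One hypothetical PGD-style step driven by the distraction loss.
x = torch.rand(1, 3, 224, 224)          # placeholder input
y = torch.tensor([0])                   # placeholder label
eps, alpha = 8 / 255, 2 / 255
x_adv = x.clone().detach().requires_grad_(True)
loss = distraction_loss(attention_map(x_adv, y))
grad = torch.autograd.grad(loss, x_adv)[0]
# Descend the loss to suppress attention; stay within the eps-ball.
x_adv = (x_adv - alpha * grad.sign()).clamp(x - eps, x + eps).clamp(0, 1)
```

In a full attack this step would be iterated, and the top-k objective is one of several plausible ways to spread the distraction beyond the single most critical area.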