Abstract: Deep neural networks (DNNs) are vulnerable to adversarial examples, which are maliciously crafted inputs designed to fool DNN models. Adversarial examples often exhibit transferability: examples generated for one model can also fool another. Most existing transfer-based black-box attacks require the training data of the target model to improve the transferability of adversarial examples. However, this training data is difficult to obtain in practice. In this paper, we propose a data-free black-box attack for crafting adversarial examples. The attack requires no knowledge of the target model's training data. We first construct a dataset whose classification task is similar to that of the target model. We then train surrogate models from pretrained models using transfer learning. Finally, we ensemble the surrogate models to generate adversarial examples that attack the target model. Extensive experiments show that the adversarial examples crafted by our method transfer effectively to the target model. Our method outperforms the baselines, and the best ensemble-based attack improves the attack success rate by a margin of 15% to 21%. Our research shows that DNNs remain at risk even when attackers cannot access the training data.
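The final step described above, ensembling surrogate models to craft an adversarial example, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the surrogates are toy linear softmax classifiers given as `(W, b)` pairs, the attack is a single FGSM-style sign-gradient step on the averaged cross-entropy loss, and the names `ensemble_loss`, `ensemble_fgsm`, and `eps` are hypothetical. The abstract does not state which attack algorithm or ensembling rule the authors use.

```python
import numpy as np


def _log_softmax(z):
    # Numerically stable log-softmax over a 1-D logit vector.
    z = z - z.max()
    return z - np.log(np.exp(z).sum())


def ensemble_loss(x, label, surrogates):
    # Mean cross-entropy of the true label over all surrogate models.
    # Each surrogate is a hypothetical linear classifier (W, b),
    # with W of shape (num_classes, dim).
    losses = [-_log_softmax(W @ x + b)[label] for W, b in surrogates]
    return float(np.mean(losses))


def ensemble_fgsm(x, label, surrogates, eps=0.05):
    # One sign-gradient step that increases the averaged surrogate loss.
    # Assumption: inputs live in [0, 1], so the result is clipped to that range.
    grad = np.zeros_like(x)
    for W, b in surrogates:
        p = np.exp(_log_softmax(W @ x + b))
        p[label] -= 1.0          # d(cross-entropy)/d(logits) = softmax - one_hot
        grad += W.T @ p          # chain rule back to the input
    grad /= len(surrogates)
    return np.clip(x + eps * np.sign(grad), 0.0, 1.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim, num_classes = 16, 4
    # Three toy surrogates standing in for fine-tuned pretrained models.
    surrogates = [(rng.normal(size=(num_classes, dim)), rng.normal(size=num_classes))
                  for _ in range(3)]
    x = rng.uniform(size=dim)
    x_adv = ensemble_fgsm(x, label=1, surrogates=surrogates)
    print(ensemble_loss(x, 1, surrogates), ensemble_loss(x_adv, 1, surrogates))
```

In a black-box setting, `x_adv` would then be submitted to the target model, which the attacker never queries for gradients. Averaging the loss across surrogates is one common way to combine an ensemble; averaging logits or probabilities are alternatives.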
External IDs:dblp:conf/hpcc/WangZZKZZ22