Abstract: We propose a voting ensemble of models trained by using block-wise transformed images with secret keys against black-box attacks. Although key-based adversarial defenses were effective against gradient-based (white-box) attacks, they cannot defend against gradient-free (black-box) attacks without requiring any secret keys. In the proposed ensemble, a number of models are trained by using images transformed with different keys and block sizes, and then a voting ensemble is applied to the models. Experimental results show that the proposed defense achieves a clean accuracy of 95.56 % and an attack success rate of less than 9 % under attacks with a noise distance of 8/255 on the CIFAR-10 dataset.
Loading