Keywords: adversarial machine learning, patch attack, certifiable defense, randomized crop
TL;DR: This paper proposes a new defense against patch attacks that samples a random set of crops from an image, classifies each crop independently, and takes the majority vote across the crops as the classification of the input image.
Abstract: This paper proposes a certifiable defense against adversarial patch attacks on image classification. Our approach classifies random crops from the original image independently and classifies the original image as the majority vote over the predicted classes of the crops. Leveraging the fact that a patch attack can only influence a bounded number of pixels in the image, we derive certified robustness bounds for the classifier. Our method is particularly effective when realistic transformations, such as affine transformations, are applied to the adversarial patch. Such transformations occur naturally when an adversarial patch is physically introduced in a scene. Our method improves upon the current state of the art in defending against patch attacks on CIFAR10 and ImageNet, both in terms of certified accuracy and inference time.
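The crop-and-vote procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`random_crop`, `classify_by_crop_vote`, `max_crops_touching_patch`), the nested-list image representation, and the square image, crop, and patch shapes are all assumptions made for illustration. The last helper shows only a simple geometric count of how many crop positions can overlap a patch; the paper's actual certified bounds may be derived differently.

```python
import random
from collections import Counter


def random_crop(image, crop_size, rng):
    """Return a crop_size x crop_size window at a uniformly random position.

    `image` is assumed to be a list of rows (a hypothetical stand-in for a tensor).
    """
    height, width = len(image), len(image[0])
    top = rng.randrange(height - crop_size + 1)
    left = rng.randrange(width - crop_size + 1)
    return [row[left:left + crop_size] for row in image[top:top + crop_size]]


def classify_by_crop_vote(image, classify_crop, crop_size, num_crops, seed=0):
    """Classify random crops independently and return the majority label.

    `classify_crop` is any callable mapping a crop to a class label; here it
    is a placeholder for the trained crop classifier.
    """
    rng = random.Random(seed)
    votes = Counter(
        classify_crop(random_crop(image, crop_size, rng))
        for _ in range(num_crops)
    )
    label = votes.most_common(1)[0][0]
    return label, votes


def max_crops_touching_patch(image_size, crop_size, patch_size):
    """Upper bound on crop positions that overlap a square patch.

    Along one axis, a crop overlaps the patch for at most
    crop_size + patch_size - 1 start offsets, and there are only
    image_size - crop_size + 1 offsets in total.
    """
    per_axis = min(crop_size + patch_size - 1, image_size - crop_size + 1)
    return per_axis ** 2


# Toy usage: an 8x8 all-zero image with a 2x2 "patch" of ones.
image = [[0.0] * 8 for _ in range(8)]
for r in range(3, 5):
    for c in range(3, 5):
        image[r][c] = 1.0


def toy_classifier(crop):
    """Hypothetical classifier: class 1 if the mean pixel exceeds 0.5."""
    pixels = [p for row in crop for p in row]
    return int(sum(pixels) / len(pixels) > 0.5)


label, votes = classify_by_crop_vote(image, toy_classifier, crop_size=4, num_crops=25)
```

In this toy setting the patch covers at most 4 of the 16 pixels in any crop, so no single crop can be pushed over the threshold and the vote stays at class 0. With a real classifier, crops that overlap the patch may flip, which is why the number of such crops matters for certification.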