Abstract: Deep neural networks can be fooled by small, imperceptible perturbations called adversarial
examples. Although these examples are carefully crafted, existing attack methods raise two major concerns:
in some cases the generated perturbations are much larger than the minimal adversarial perturbation, while
in others the attack requires so many iterations that it becomes impractical. Moreover, existing sparse attacks
are either too computationally complex or not sparse enough to remain imperceptible. An attack should
therefore be both fast and minimal in terms of ℓ2-norm. In this work, we use a dictionary learning technique
to generate sparse adversarial examples based on feature maps of target images, and we present two novel
algorithms that tune the dictionary learning process and the feature map selection. Results on MNIST and
ImageNet show that our attack is better than or competitive with state-of-the-art methods. Compared with
sparse attacks recently introduced in the literature, our method achieves a comparable attack success rate
with a smaller ℓ2-norm. We also tested the efficacy of our attack in the presence of defense mechanisms,
and none of the defenses was able to counter the effect of our proposed attack.
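To make the core idea concrete, the sketch below illustrates dictionary learning with sparse coding over feature-map patches, which is the general mechanism the abstract describes. It is a minimal illustration only: the shapes, hyperparameters (e.g., the ℓ2 budget `epsilon`), and the use of scikit-learn's `DictionaryLearning` are our assumptions, not the authors' two algorithms.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)

# Stand-in for a feature map of the target image (e.g., from a CNN layer),
# flattened into patch vectors. Real feature maps would replace this.
feature_patches = rng.normal(size=(200, 64))  # 200 patches, 64-dim each

# Learn an overcomplete dictionary; OMP with few nonzero coefficients
# enforces sparsity of the codes.
dico = DictionaryLearning(
    n_components=128,             # overcomplete: 128 atoms for 64-dim patches
    transform_algorithm="omp",    # orthogonal matching pursuit
    transform_n_nonzero_coefs=5,  # at most 5 active atoms per patch
    max_iter=20,
    random_state=0,
)
codes = dico.fit_transform(feature_patches)   # sparse codes, shape (200, 128)

# Reconstruct patches from the sparse codes; the residual gives a sparse
# direction in feature space from which a perturbation can be built.
recon = codes @ dico.components_
perturbation = recon - feature_patches

# Rescale to a small l2 budget (epsilon is an assumed hyperparameter).
epsilon = 0.5
perturbation *= epsilon / (np.linalg.norm(perturbation) + 1e-12)
print("l2 norm of perturbation:", np.linalg.norm(perturbation))
print("fraction of nonzero code entries:", np.mean(codes != 0))
```

In this sketch, sparsity comes from limiting the number of active dictionary atoms per patch, which mirrors the abstract's goal of perturbations that are both sparse and small in ℓ2-norm.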