Abstract: Neural classifiers have achieved near-human-level performance on several real-world tasks. Despite these successes, recent work has demonstrated their vulnerability to adversarial attacks. In particular, image classifiers have been shown to be vulnerable to fine-tuned noise that perturbs a small number of pixels, known as sparse attacks. To generate such perturbations, current methods either prioritise query efficiency by leaving the size of the perturbation unbounded, or prioritise minimizing its size by allowing a large number of pixels to be perturbed. Addressing the drawbacks of both approaches, we propose a query-efficient sparse adversarial attack that minimizes the number of perturbed pixels by formulating the attack as a constrained bi-objective optimization problem. In the single-objective, unbounded, query-efficient scenario, our method outperforms state-of-the-art sparse attack algorithms in terms of success rate and query efficiency. When also minimizing the number of perturbed pixels in the bi-objective setting, the proposed method generates adversarial perturbations that affect fewer pixels than its state-of-the-art competitors.