Keywords: Adversarial Robustness, Adversarial Defense, Adversarial Training, Large perturbations
TL;DR: We propose methods to improve Adversarial Robustness at large perturbation bounds
Abstract: The vulnerability of Deep Neural Networks to Adversarial Attacks has fuelled research towards building robust models. While most existing Adversarial Training algorithms aim towards defending against imperceptible attacks, real-world adversaries are not limited by such constraints. In this work, we aim to achieve adversarial robustness at larger epsilon bounds. We first discuss the ideal goals of an adversarial defense algorithm beyond perceptual limits, and further highlight the shortcomings of naively extending existing training algorithms to higher perturbation bounds. In order to overcome these shortcomings, we propose a novel defense, Oracle-Aligned Adversarial Training (OA-AT), that attempts to align the predictions of the network with that of an Oracle during adversarial training. The proposed approach achieves state-of-the-art performance at large epsilon bounds ($\ell_\infty$ bound of $16/255$) while outperforming adversarial training algorithms such as AWP, TRADES and PGD-AT at standard perturbation bounds ($\ell_\infty$ bound of $8/255$) as well.
2 Replies
Loading