Manifold Regularization for Locally Stable Deep Neural Networks

28 Sept 2020 (modified: 05 May 2023) · ICLR 2021 Conference Blind Submission · Readers: Everyone
Keywords: regularization, deep learning, adversarial robustness
Abstract: We apply concepts from manifold regularization to develop new regularization techniques for training locally stable deep neural networks. Our regularizers encourage functions which are smooth not only in their predictions but also their decision boundaries. Empirically, our networks exhibit stability in a diverse set of perturbation models, including $\ell_2$, $\ell_\infty$, and Wasserstein-based perturbations; in particular, against a state-of-the-art PGD adversary, a single model achieves both $\ell_\infty$ robustness of 40% at $\epsilon = 8/255$ and $\ell_2$ robustness of 48% at $\epsilon = 1.0$ on CIFAR-10. We also obtain state-of-the-art verified accuracy of 21% in the same $\ell_\infty$ setting. Furthermore, our techniques are efficient, incurring overhead on par with two additional parallel forward passes through the network; in the case of CIFAR-10, we achieve our results after training for only 3 hours, compared to more than 70 hours for standard adversarial training.
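The abstract describes regularizers that encourage the network to be smooth over small perturbations of its inputs. As a rough, hedged illustration of the general idea (not the paper's exact regularizers, whose precise form is not given on this page), a manifold-style smoothness penalty can be estimated by averaging the squared output difference between an input and small random perturbations of it; the function `manifold_smoothness_penalty` below is a hypothetical name introduced here for illustration:

```python
import numpy as np

def manifold_smoothness_penalty(f, x, eps=0.1, n_pairs=32, seed=0):
    """Monte Carlo estimate of a generic smoothness penalty:
    the mean of ||f(x + delta) - f(x)||^2 over small random
    perturbations delta drawn uniformly from [-eps, eps]^d.
    This is a sketch of the manifold-regularization idea only;
    the paper's actual regularizers differ in form."""
    rng = np.random.default_rng(seed)
    fx = f(x)
    total = 0.0
    for _ in range(n_pairs):
        delta = rng.uniform(-eps, eps, size=x.shape)
        diff = f(x + delta) - fx
        total += float(np.sum(diff ** 2))
    return total / n_pairs

# Example: a linear map is already smooth, so for small eps the
# estimated penalty is small; a highly non-smooth f would score higher.
W = np.array([[1.0, -0.5], [0.3, 2.0]])
f = lambda x: W @ x
x = np.array([0.5, -1.0])
penalty = manifold_smoothness_penalty(f, x)
```

In training, a term like this would be added to the task loss with a weighting coefficient, so that minimizing the combined objective trades prediction accuracy against local stability of the outputs.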
One-sentence Summary: We derive new manifold regularizers for deep neural networks that are (1) cheap to train with and (2) yield stability against a variety of (unseen) perturbation models.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Reviewed Version (pdf): https://openreview.net/references/pdf?id=V80MTyFuuR
26 Replies
