Logit Regularization Methods for Adversarial Robustness

27 Sept 2018 (modified: 05 May 2023) · ICLR 2019 Conference Withdrawn Submission · Readers: Everyone
Abstract: While great progress has been made at making neural networks effective across a wide range of tasks, many are surprisingly vulnerable to small, carefully chosen perturbations of their input, known as adversarial examples. In this paper, we advocate for and experimentally investigate the use of logit regularization techniques as an adversarial defense, which can be used in conjunction with other methods to create adversarial robustness at little to no additional cost. We demonstrate that much of the effectiveness of one recent adversarial defense mechanism can be attributed to logit regularization, and show how to improve its defense against both white-box and black-box attacks, in the process creating a stronger black-box attack against PGD-based models.
Keywords: adversarial
TL;DR: Logit regularization methods help explain and improve state-of-the-art adversarial defenses
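For concreteness, logit regularization in this setting means adding a penalty on a network's pre-softmax outputs (logits) to the training loss. Below is a minimal PyTorch sketch of one common such technique, logit squeezing (an L2 penalty on the logits); the helper name logit_squeezing_loss and the weight lam are illustrative assumptions, not the paper's exact formulation, which the abstract does not specify.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def logit_squeezing_loss(logits, labels, lam=0.05):
        # Standard cross-entropy plus an L2 penalty on the logits.
        # lam is a hypothetical regularization weight, not a value from the paper.
        ce = F.cross_entropy(logits, labels)
        penalty = lam * logits.pow(2).sum(dim=1).mean()
        return ce + penalty

    # Toy usage: a linear classifier on random data; shapes are illustrative.
    model = nn.Linear(784, 10)
    x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
    loss = logit_squeezing_loss(model(x), y)
    loss.backward()

The penalty discourages large-magnitude, overconfident logits during training, which is the kind of regularization effect the abstract argues accounts for much of the studied defense's robustness.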