Keywords: regularization, generalization, entropy, adversarial
TL;DR: Maximizing entropy on adversarial examples can improve generalization
Abstract: Supervised classification methods that directly maximize the likelihood of the training data often overfit. This overfitting is typically mitigated by regularizing the loss function (e.g., label smoothing, weight decay) or by minimizing the same loss on new examples (e.g., data augmentation, adversarial training). In this work, we propose a complementary regularization strategy: training the model to be unconfident on examples that are generated to have ambiguous labels. We call our approach Maximum Predictive Entropy (MPE). These automatically generated examples are cheap to compute, so our method is only 30% slower than standard data augmentation. Adding MPE to existing regularization techniques, such as label smoothing, increases test accuracy by 1-3%, with larger gains in the small-data regime.
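As a rough illustration of the idea, the objective can be sketched as standard cross-entropy on labeled examples minus a predictive-entropy term on generated ambiguous examples. This is a minimal sketch, assuming a simple form of the objective; the function names, the mixing weight `lam`, and how the ambiguous examples' logits are obtained are assumptions, not the paper's exact formulation.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the class dimension.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def mpe_objective(clean_logits, labels, ambig_logits, lam=0.5):
    # Cross-entropy on ordinary labeled training examples.
    p = softmax(clean_logits)
    ce = -np.log(p[np.arange(len(labels)), labels] + 1e-12).mean()
    # Predictive entropy on generated label-ambiguous examples; subtracting
    # it from the loss rewards unconfident (near-uniform) predictions there.
    q = softmax(ambig_logits)
    entropy = -(q * np.log(q + 1e-12)).sum(axis=1).mean()
    return ce - lam * entropy
```

Under this sketch, pushing the model toward uniform outputs on the ambiguous examples lowers the objective, while confident predictions on them are penalized.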