Thermometer Encoding: One Hot Way To Resist Adversarial Examples
Jacob Buckman, Aurko Roy, Colin Raffel, Ian Goodfellow
Feb 15, 2018 (modified: Feb 23, 2018) | ICLR 2018 Conference Blind Submission | readers: everyone
Abstract: It is well known that it is possible to construct "adversarial examples"
for neural networks: inputs which are misclassified by the network
yet indistinguishable from true data. We propose a simple
modification to standard neural network architectures, thermometer
encoding, which significantly increases the robustness of the network to
adversarial examples. We demonstrate this robustness with experiments
on the MNIST, CIFAR-10, CIFAR-100, and SVHN datasets, and show that
models with thermometer-encoded inputs consistently have higher accuracy
on adversarial examples, without decreasing generalization.
State-of-the-art accuracy under the strongest known white-box attack was
increased from 93.20% to 94.30% on MNIST and 50.00% to 79.16% on CIFAR-10.
We explore the properties of these networks, providing evidence
that thermometer encodings help neural networks to
find more non-linear decision boundaries.
TL;DR: Input discretization leads to robustness against adversarial examples
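
The abstract names thermometer encoding without spelling it out, so the following is only a minimal sketch of how such an input discretization is commonly defined, not the authors' implementation. It assumes inputs are scaled to [0, 1] and split into equal-width buckets; the function names, the bucket count, and the direction in which the "thermometer" fills (every position at or above the bucket index is set) are all illustrative assumptions.

```python
def quantize(x, levels):
    """Map a value x in [0, 1] to a bucket index in 0..levels-1.

    Equal-width buckets are an assumption made for this sketch.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    # x == 1.0 would land in bucket `levels`, so clamp it to the last bucket.
    return min(int(x * levels), levels - 1)


def one_hot(x, levels):
    """One-hot code: only the position of the bucket index is set."""
    idx = quantize(x, levels)
    return [1 if i == idx else 0 for i in range(levels)]


def thermometer(x, levels):
    """Thermometer code: every position at or above the bucket index is set.

    This is the running (cumulative) sum of the one-hot code; the fill
    direction is a convention chosen for this sketch.
    """
    idx = quantize(x, levels)
    return [1 if i >= idx else 0 for i in range(levels)]


if __name__ == "__main__":
    for value in (0.0, 0.55, 0.6, 1.0):
        print(value, one_hot(value, 4), thermometer(value, 4))
```

Two inputs that fall in the same bucket, such as 0.55 and 0.6 with four levels, receive identical codes, which illustrates how discretization can absorb small input perturbations. Unlike the one-hot code, the thermometer code also preserves ordering: neighbouring buckets differ in exactly one position.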