Replacing Loss Functions And Target Representations For Adversarial Defense
Sean Saito, Sujoy Roy
Feb 12, 2018 (modified: Feb 13, 2018) | ICLR 2018 Workshop Submission | Readers: everyone
Abstract: Recent work has shown that neural networks are susceptible to adversarial data despite their high performance across various tasks. Given their increasingly frequent use in real-world applications, there is a growing need for techniques that make neural networks more robust against attacks. In this work, we propose three simple techniques for adversarial defense: (1) changing the loss function from cross-entropy to mean-squared error, (2) representing targets as codewords generated from random codebooks, and (3) using an autoencoder to filter noisy logits before the final activation layer. Our experiments on CIFAR-10 using the DenseNet model show that these techniques can help prevent targeted attacks and improve classification accuracy on adversarial data generated in white-box or black-box settings.
TL;DR: Changing the loss function and target representation, along with adding an autoencoder layer, can significantly improve resistance to adversarial attacks.
Keywords: adversarial attacks, target representation, loss function
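As a rough illustration of techniques (1) and (2) from the abstract, the sketch below builds a random codebook, scores network outputs against the codeword targets with mean-squared error, and decodes a prediction as the nearest codeword. This is not the authors' implementation: the abstract does not specify how codewords are sampled, how long they are, or how outputs are mapped back to classes, so the uniform sampling in [-1, 1], the codeword length of 32, the nearest-codeword decoding rule, and the helper names `make_codebook`, `mse_loss`, and `decode` are all assumptions made for the example. The autoencoder of technique (3) is not shown.

```python
import numpy as np


def make_codebook(num_classes, code_len, seed=0):
    """Draw one random codeword per class (assumed: uniform in [-1, 1])."""
    rng = np.random.default_rng(seed)
    return rng.uniform(-1.0, 1.0, size=(num_classes, code_len))


def mse_loss(outputs, labels, codebook):
    """Mean-squared error between outputs and the codewords of the true labels."""
    targets = codebook[labels]  # shape: (batch, code_len)
    return float(np.mean((outputs - targets) ** 2))


def decode(outputs, codebook):
    """Assumed decoding rule: predict the class whose codeword is nearest."""
    diffs = outputs[:, None, :] - codebook[None, :, :]  # (batch, classes, code_len)
    sq_dists = (diffs ** 2).sum(axis=-1)                # (batch, classes)
    return sq_dists.argmin(axis=1)


# Stand-in for network outputs: the true codewords plus small Gaussian noise.
codebook = make_codebook(num_classes=10, code_len=32)
labels = np.array([3, 7, 1])
noise = 0.05 * np.random.default_rng(1).normal(size=(3, 32))
outputs = codebook[labels] + noise

loss = mse_loss(outputs, labels, codebook)
preds = decode(outputs, codebook)
```

In a real training loop the `outputs` array would come from the classifier's final layer (here, a DenseNet on CIFAR-10 with 10 classes), and `mse_loss` would replace the usual cross-entropy objective over one-hot targets.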