Keywords: Adversarial robustness, channel suppressing, activation strategy.
Abstract: The study of adversarial examples and their activations have attracted significant attention for secure and robust learning with deep neural networks (DNNs). Different from existing works, in this paper, we highlight two new characteristics of adversarial examples from the channel-wise activation perspective: 1) the activation magnitudes of adversarial examples are higher than that of natural examples; and 2) the channels are activated more uniformly by adversarial examples than natural examples. We find that, while the state-of-the-art defense adversarial training has addressed the first issue of high activation magnitude via training on adversarial examples, the second issue of uniform activation remains. This motivates us to suppress redundant activations from being activated by adversarial perturbations during the adversarial training process, via a Channel-wise Activation Suppressing (CAS) training strategy. We show that CAS can train a model that inherently suppresses adversarial activations, and can be easily applied to existing defense methods to further improve their robustness. Our work provides a simplebut generic training strategy for robustifying the intermediate layer activations of DNNs.
One-sentence Summary: Training with Channel-wise Activation Suppressing (CAS) can help imrove the robustness of adversarial training.
Supplementary Material: zip
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Code: [![github](/images/github_icon.svg) bymavis/CAS_ICLR2021](https://github.com/bymavis/CAS_ICLR2021)
Data: [CIFAR-10](https://paperswithcode.com/dataset/cifar-10), [SVHN](https://paperswithcode.com/dataset/svhn)