Keywords: Adversarial Training, Image Recognition, Batch Normalization, Robustness, Generalization
TL;DR: This work proposes an adversarial training method that resists adversarially perturbed Batch Normalization statistics in order to improve recognition on benign images.
Abstract: Recently, it has been shown that adversarial training (AT), which injects adversarial samples during training, can improve recognition quality. However, existing AT methods suffer from performance degradation on benign samples, leading to a gap between robustness and generalization. We argue that this gap is caused by inaccurate estimation in the Batch Normalization (BN) layer, due to the distributional discrepancy between the training and test sets. To bridge this gap, this paper frames the problem as adversarial robustness against the unavoidable noise in BN statistics. In particular, we propose a novel strategy that adversarially perturbs the BN layer, termed APART. APART leverages gradients to shift BN statistics and trains models to resist the shifted statistics, enhancing their robustness to this noise. We then introduce APART into a new paradigm of AT called model-based AT, which strengthens models' tolerance to noise in BN. Experiments indicate that APART improves model generalization, leading to significant accuracy gains on benchmarks such as CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet.
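The core idea described above is to shift BN's per-batch statistics along the loss gradient and then update the network so it withstands that shift. Below is a minimal sketch of one way such a training step could look in PyTorch; it is not the authors' implementation, and the module name `PerturbableBN2d`, the helper `apart_style_step`, and the step size `eps` are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PerturbableBN2d(nn.BatchNorm2d):
    """BatchNorm2d whose batch statistics accept an additive perturbation (illustrative sketch)."""
    def __init__(self, num_features, **kwargs):
        super().__init__(num_features, **kwargs)
        self.delta_mean = None  # additive shift of the batch mean
        self.delta_var = None   # additive shift of the batch variance

    def forward(self, x):
        if not self.training:
            return super().forward(x)  # use running statistics at evaluation time
        # Per-channel batch statistics.
        mean = x.mean(dim=(0, 2, 3))
        var = x.var(dim=(0, 2, 3), unbiased=False)
        # Track running statistics with the clean (unperturbed) batch statistics.
        if self.track_running_stats:
            with torch.no_grad():
                self.running_mean.mul_(1 - self.momentum).add_(self.momentum * mean)
                self.running_var.mul_(1 - self.momentum).add_(self.momentum * var)
        if self.delta_mean is not None:
            mean = mean + self.delta_mean
            var = (var + self.delta_var).clamp(min=1e-5)
        x_hat = (x - mean[None, :, None, None]) / torch.sqrt(var[None, :, None, None] + self.eps)
        return x_hat * self.weight[None, :, None, None] + self.bias[None, :, None, None]

def apart_style_step(model, x, y, optimizer, eps=1e-3):
    """One step: shift BN statistics along the loss gradient, then train against the shift."""
    bns = [m for m in model.modules() if isinstance(m, PerturbableBN2d)]
    # 1) Forward pass with zero perturbations that require gradients.
    for bn in bns:
        bn.delta_mean = torch.zeros(bn.num_features, device=x.device, requires_grad=True)
        bn.delta_var = torch.zeros(bn.num_features, device=x.device, requires_grad=True)
    deltas = [t for bn in bns for t in (bn.delta_mean, bn.delta_var)]
    grads = torch.autograd.grad(F.cross_entropy(model(x), y), deltas)
    # 2) Single-step (FGSM-style) shift of the statistics in the loss-increasing direction.
    for bn, g_mean, g_var in zip(bns, grads[0::2], grads[1::2]):
        bn.delta_mean = eps * g_mean.sign()
        bn.delta_var = eps * g_var.sign()
    # 3) Update the model so that it resists the shifted statistics.
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    optimizer.step()
    for bn in bns:  # reset perturbations for subsequent passes
        bn.delta_mean = bn.delta_var = None
    return loss.item()
```

Under these assumptions, replacing nn.BatchNorm2d with PerturbableBN2d in an existing architecture and calling apart_style_step in the training loop is all the sketch requires; the single-step perturbation adds roughly one extra forward and backward pass per batch.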
Supplementary Material: pdf