Defensive Quantization Layer For Convolutional Network Against Adversarial Attack

25 Sep 2019 (modified: 24 Dec 2019) · ICLR 2020 Conference Withdrawn Submission · Readers: Everyone
  • Keywords: quantization, adversarial example, robustness, convolutional neural network, concept
  • TL;DR: We propose a quantization-based method that regularizes a CNN's learned representations to align automatically with a trainable concept matrix, thereby effectively filtering out adversarial perturbations.
  • Abstract: Recent research has extensively revealed the vulnerability of deep neural networks, especially convolutional neural networks (CNNs) on the task of image recognition, by creating adversarial samples that differ only "slightly" from legitimate samples. This vulnerability indicates that these powerful models are sensitive to specific perturbations and cannot filter out such adversarial perturbations. In this work, we propose a quantization-based method that enables a CNN to filter out adversarial perturbations effectively. Notably, unlike prior work on input quantization, we apply quantization in the intermediate layers of a CNN. Our approach is naturally aligned with the clustering of the coarse-grained semantic information learned by a CNN. Furthermore, to compensate for the loss of information inevitably caused by quantization, we propose multi-head quantization, where we project data points into different sub-spaces and perform quantization within each sub-space. We enclose our design in a quantization layer named the Q-Layer. Results on the MNIST and Fashion-MNIST datasets demonstrate that adding only one Q-Layer to a CNN significantly improves its robustness against both white-box and black-box attacks.
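The abstract's multi-head quantization can be illustrated with a minimal sketch: split a feature vector into sub-spaces ("heads") and snap each sub-vector to its nearest codeword. The function name, the number of heads, and the codebook contents below are illustrative assumptions, not the paper's actual Q-Layer implementation, which would use trainable codebooks inside the network.

```python
import numpy as np

def multi_head_quantize(x, codebooks):
    """Sketch of multi-head quantization (hypothetical helper, not the
    paper's Q-Layer): split features of shape (N, D) into as many
    sub-spaces as there are codebooks, then replace each sub-vector
    with its nearest codeword from that sub-space's codebook."""
    heads = np.split(x, len(codebooks), axis=-1)  # one sub-space per head
    quantized = []
    for h, cb in zip(heads, codebooks):
        # Pairwise distances between sub-vectors (N, d) and codewords (K, d).
        dists = np.linalg.norm(h[:, None, :] - cb[None, :, :], axis=-1)
        # Replace each sub-vector with its nearest codeword.
        quantized.append(cb[np.argmin(dists, axis=1)])
    return np.concatenate(quantized, axis=-1)
```

Because each head quantizes only a slice of the feature vector, the combined output can represent exponentially many distinct codes while each codebook stays small, which is how multiple heads compensate for the information lost by quantizing in a single space.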