- TL;DR: Cnvolutional neural networks characterization for backdoored classifier detection and understanding.
- Abstract: We propose a new representation, one-pixel signature, that can be used to reveal the characteristics of the convolution neural networks (CNNs). Here, each CNN classifier is associated with a signature that is created by generating, pixel-by-pixel, an adversarial value that is the result of the largest change to the class prediction. The one-pixel signature is agnostic to the design choices of CNN architectures such as type, depth, activation function, and how they were trained. It can be computed efficiently for a black-box classifier without accessing the network parameters. Classic networks such as LetNet, VGG, AlexNet, and ResNet demonstrate different characteristics in their signature images. For application, we focus on the classifier backdoor detection problem where a CNN classifier has been maliciously inserted with an unknown Trojan. We show the effectiveness of the one-pixel signature in detecting backdoored CNN. Our proposed one-pixel signature representation is general and it can be applied in problems where discriminative classifiers, particularly neural network based, are to be characterized.
- Keywords: Neural Network characterization, backdoor detection, one-pixel signature