Keywords: Representation learning, Missing Information in Deep Networks, Gradient-based Representation
TL;DR: We propose a gradient-based representation for characterizing information that deep networks have not learned.
Abstract: Deep networks face challenges of ensuring their robustness against inputs that cannot be effectively represented by information learned from training data. We attribute this vulnerability to the limitations inherent to activation-based representation. To complement the learned information from activation-based representation, we propose utilizing a gradient-based representation that explicitly focuses on missing information. In addition, we propose a directional constraint on the gradients as an objective during training to improve the characterization of missing information. To validate the effectiveness of the proposed approach, we compare the anomaly detection performance of gradient-based and activation-based representations. We show that the gradient-based representation outperforms the activation-based representation by 0.093 in CIFAR-10 and 0.361 in CURE-TSR datasets in terms of AUROC averaged over all classes. Also, we propose an anomaly detection algorithm that uses the gradient-based representation, denoted as GradCon, and validate its performance on three benchmarking datasets. The proposed method outperforms the majority of the state-of-the-art algorithms in CIFAR-10, MNIST, and fMNIST datasets with an average AUROC of 0.664, 0.973, and 0.934, respectively.
Data: [CIFAR-10](https://paperswithcode.com/dataset/cifar-10), [MNIST](https://paperswithcode.com/dataset/mnist)
Original Pdf: pdf