An Algorithm for Out-Of-Distribution Attack to Neural Network Encoder

28 Sept 2020 (modified: 05 May 2023) · ICLR 2021 Conference Blind Submission
Keywords: Out-Of-Distribution, DNN, image classification
Abstract: Deep neural networks (DNNs), especially convolutional neural networks, have achieved superior performance on image classification tasks. However, such performance is only guaranteed if the input to a trained model is similar to the training samples, i.e., the input follows the probability distribution of the training set. Out-Of-Distribution (OOD) samples do not follow the distribution of the training set, and therefore the predicted class labels on OOD samples become meaningless. Classification-based methods have been proposed for OOD detection; however, in this study we show that methods of this type have no theoretical guarantee and can be broken in practice by our OOD Attack algorithm, owing to the dimensionality reduction in DNN models. We also show that Glow likelihood-based OOD detection is breakable as well.
One-sentence Summary: Neural networks are easily fooled by OOD samples due to the non-bijective mapping caused by dimensionality reduction; we present a new method to generate such OOD samples.
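
The abstract's core claim is that dimensionality reduction makes the encoder mapping non-bijective, so many inputs share one latent code, and an optimizer can find an OOD input whose features match those of an in-distribution image. Below is a minimal sketch of that idea in PyTorch; the encoder `f`, the noise initialization, and all hyperparameters are illustrative assumptions, not the authors' exact algorithm.

```python
import torch
import torch.nn.functional as F

def ood_attack(f, x_target, steps=500, lr=0.01):
    """Sketch of the OOD-attack idea: optimize a noise image z so that
    the encoder output f(z) matches f(x_target). Because f reduces
    dimensionality, such a z generally exists even though z itself is
    far from the training distribution. (Illustrative only.)"""
    with torch.no_grad():
        target = f(x_target)                 # latent code to imitate
    # Start from uniform noise, i.e., a clearly out-of-distribution image.
    z = torch.rand_like(x_target, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.mse_loss(f(z), target)      # match latent representations
        loss.backward()
        opt.step()
        with torch.no_grad():
            z.clamp_(0.0, 1.0)               # keep z a valid image
    return z.detach()
```

Under this sketch, the returned `z` would receive the same downstream prediction as `x_target` while looking like noise, which is the failure mode the abstract argues classification-based OOD detectors cannot rule out.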
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Reviewed Version (pdf): https://openreview.net/references/pdf?id=mpmJoWqjAM