- Keywords: Fourier network, out-of-distribution detection, large initialization, uncertainty, ensembles
- TL;DR: Using sine activations and large weight initialization to mitigate some of the issues regular ensembles face on out-of-distribution detection tasks.
- Abstract: A simple method for obtaining uncertainty estimates for neural network classifiers (e.g. for out-of-distribution detection) is to use an ensemble of independently trained networks and average the softmax outputs. While this method works, its results are still very far from human performance on standard data sets. We investigate how this method works and observe three fundamental limitations: "unreasonable" extrapolation, "unreasonable" agreement between the networks in an ensemble, and the filtering out of features that distinguish the training distribution from some out-of-distribution inputs but do not contribute to the classification. To mitigate these problems we suggest "large" initializations in the first layers and changing the activation function to sin(x) in the last hidden layer. We show that this combines the out-of-distribution behavior of nearest neighbor methods with the generalization capabilities of neural networks, and achieves greatly improved out-of-distribution detection on standard data sets (MNIST/fashionMNIST/notMNIST, SVHN/CIFAR10).
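The ingredients named in the abstract can be sketched in a few lines. The following is a minimal, hypothetical illustration (all names and scale values are our assumptions, not the authors' implementation): an MLP whose first layer uses a "large" weight initialization and whose last hidden layer applies sin(x), combined with softmax averaging over an ensemble of independently initialized networks. The networks are untrained, so this only shows the architecture and the ensemble averaging, not the reported results.

```python
import math
import random

def init_layer(n_in, n_out, scale):
    # Gaussian init; scale >> 1 in the first layer is the "large" init idea.
    return [[random.gauss(0.0, scale / math.sqrt(n_in)) for _ in range(n_in)]
            for _ in range(n_out)]

def matvec(w, x):
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

class SinNet:
    """Toy 2-layer net: large first-layer init, sin in the last hidden layer."""
    def __init__(self, n_in=2, n_hidden=16, n_out=3, first_scale=10.0):
        self.w1 = init_layer(n_in, n_hidden, first_scale)  # "large" init
        self.w2 = init_layer(n_hidden, n_out, 1.0)

    def forward(self, x):
        # sin(x) activation: far from the data, features oscillate instead
        # of extrapolating linearly, so different nets disagree.
        h = [math.sin(v) for v in matvec(self.w1, x)]
        return softmax(matvec(self.w2, h))

def ensemble_predict(nets, x):
    # Average the softmax outputs of independently initialized networks.
    probs = [net.forward(x) for net in nets]
    k = len(probs[0])
    return [sum(p[i] for p in probs) / len(probs) for i in range(k)]

random.seed(0)
nets = [SinNet() for _ in range(5)]
far_ood_input = [100.0, -100.0]  # far outside any plausible training range
p = ensemble_predict(nets, far_ood_input)
print(p)  # a valid probability vector; disagreement pushes it toward uniform
```

The intended effect is that, unlike ReLU networks (which extrapolate linearly and can agree confidently far from the data), the oscillating sin features decorrelate ensemble members on far-away inputs, keeping the averaged softmax uncertain.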