Neural Network Regression with Beta, Dirichlet, and Dirichlet-Multinomial Outputs

Peter Sadowski; Pierre Baldi

27 Sept 2018 (modified: 05 May 2023) | ICLR 2019 Conference Blind Submission | Readers: Everyone

Abstract: We propose a method for quantifying uncertainty in neural network regression models when the targets are real values on a $d$-dimensional simplex, such as probabilities. We show that each target can be modeled as a sample from a Dirichlet distribution, where the parameters of the Dirichlet are provided by the output of a neural network, and that the combined model can be trained using the gradient of the data likelihood. This approach provides interpretable predictions in the form of multidimensional distributions, rather than point estimates, from which one can obtain confidence intervals or quantify risk in decision making. Furthermore, we show that the same approach can be used to model targets in the form of empirical counts as samples from the Dirichlet-multinomial compound distribution. In experiments, we verify that our approach provides these benefits without harming the performance of the point estimate predictions on two diverse applications: (1) distilling deep convolutional networks trained on CIFAR-100, and (2) predicting the location of particle collisions in the XENON1T Dark Matter detector.
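To make the two likelihoods named in the abstract concrete, the following is a minimal sketch of the per-example training losses, not the authors' implementation. It assumes the network emits unconstrained real values that are mapped to positive concentration parameters with a softplus; that mapping, the clipping constant `eps`, and all function names here are illustrative assumptions that do not come from the paper.

```python
import math

def softplus(x):
    # Numerically stable log(1 + exp(x)); assumed link from raw outputs to alpha > 0.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def to_concentration(raw_outputs, eps=1e-6):
    # Hypothetical helper: map raw network outputs to Dirichlet parameters.
    return [softplus(z) + eps for z in raw_outputs]

def dirichlet_nll(alpha, target, eps=1e-12):
    # Negative log-density of a point `target` on the simplex under Dirichlet(alpha).
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    # Targets are clipped away from zero so the logarithm stays finite.
    log_kernel = sum((a - 1.0) * math.log(max(y, eps)) for a, y in zip(alpha, target))
    return -(log_norm + log_kernel)

def dirichlet_multinomial_nll(alpha, counts):
    # Negative log-probability of integer `counts` under the
    # Dirichlet-multinomial compound distribution with parameters alpha.
    n, a0 = sum(counts), sum(alpha)
    log_p = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    log_p += math.lgamma(a0) - math.lgamma(n + a0)
    log_p += sum(math.lgamma(c + a) - math.lgamma(a) for c, a in zip(counts, alpha))
    return -log_p

def dirichlet_mean(alpha):
    # Point estimate: the mean of Dirichlet(alpha), which lies on the simplex.
    total = sum(alpha)
    return [a / total for a in alpha]
```

In a training loop, one would average one of these losses over a batch and differentiate it with respect to the network weights. The predicted `alpha` then gives both a point estimate (its normalized value) and a measure of confidence (its total magnitude).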

Keywords: regression, uncertainty, deep learning

TL;DR: Neural network regression should use a Dirichlet output distribution when the targets are probabilities, in order to quantify the uncertainty of its predictions.

Data: [CIFAR-100](https://paperswithcode.com/dataset/cifar-100)
