A Scalable Laplace Approximation for Neural Networks
Hippolyt Ritter, Aleksandar Botev, David Barber
Feb 15, 2018 (modified: Feb 23, 2018) · ICLR 2018 Conference Blind Submission
Abstract: We leverage recent insights from second-order optimisation for neural networks to construct a Kronecker factored Laplace approximation to the posterior over the weights of a trained network. Our approximation requires no modification of the training procedure, enabling practitioners to estimate the uncertainty of models currently used in production without having to retrain them. We extensively compare our method against Dropout and a diagonal Laplace approximation for estimating the uncertainty of a network. We demonstrate that our Kronecker factored method leads to better uncertainty estimates on out-of-distribution data and is more robust to simple adversarial attacks. Our approach requires calculating only two square curvature factor matrices for each layer, whose dimensions equal the input and output size of that layer, respectively, making the method efficient both computationally and in terms of memory usage. We illustrate its scalability by applying it to a state-of-the-art convolutional network architecture.
TL;DR: We construct a Kronecker factored Laplace approximation for neural networks that leads to an efficient matrix normal distribution over the weights.
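To make the matrix normal structure concrete, here is a minimal NumPy sketch of how one might draw a weight sample for a single layer from the two small Kronecker factors, rather than from a full covariance over all weights. All variable names, the stand-in curvature factors, and the regularisation constant are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Hypothetical sketch: sampling a layer's weights from a matrix normal
# MN(W_map, G^{-1}, A^{-1}) implied by a Kronecker-factored Laplace
# approximation. A (d_in x d_in) and G (d_out x d_out) stand in for the
# two square curvature factors; in practice they would come from the
# network's curvature, not from random data as here.
rng = np.random.default_rng(0)
d_in, d_out = 4, 3

W_map = rng.standard_normal((d_out, d_in))    # MAP weights of the layer
A = np.cov(rng.standard_normal((d_in, 100)))  # stand-in input-side factor
G = np.cov(rng.standard_normal((d_out, 100))) # stand-in output-side factor

# Regularise the factors (playing the role of a prior precision) so they
# are positive definite, then invert the small matrices.
tau = 1.0
A_inv = np.linalg.inv(A + tau * np.eye(d_in))
G_inv = np.linalg.inv(G + tau * np.eye(d_out))

# Draw W ~ MN(W_map, G_inv, A_inv) via Cholesky factors of the two
# per-side covariances; no (d_in*d_out)-sized covariance is ever formed.
L_G = np.linalg.cholesky(G_inv)
L_A = np.linalg.cholesky(A_inv)
E = rng.standard_normal((d_out, d_in))
W_sample = W_map + L_G @ E @ L_A.T

print(W_sample.shape)  # (3, 4)
```

The key efficiency point the abstract makes is visible here: only the two small factors (d_in × d_in and d_out × d_out) are stored and factorised, never the full d_in·d_out covariance.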
Keywords: deep learning, neural networks, Laplace approximation, Bayesian deep learning