Bayesian Neural Network Priors Revisited

Published: 09 Dec 2020, Last Modified: 22 Oct 2023
ICBINB 2020 Spotlight
Keywords: Bayesian neural networks, priors, Gaussian
TL;DR: Contrary to common practice, isotropic Gaussian distributions are not generally the best choice of priors for Bayesian neural networks.
Abstract: Isotropic Gaussian priors are the de facto standard for modern Bayesian neural network inference. However, such simplistic priors are unlikely either to accurately reflect our true beliefs about the weight distributions or to give optimal performance. We study summary statistics of the weights of (convolutional) neural networks trained using SGD. We find that in certain circumstances these networks have heavy-tailed weight distributions, while convolutional network weights often display strong spatial correlations. Building these observations into the respective priors, we obtain improved performance on MNIST classification. Remarkably, we find that using a more accurate prior partially mitigates the cold posterior effect: it improves performance at high temperatures, which correspond to exact Bayesian inference, while having less of an effect at small temperatures.
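The two empirical observations above can be checked directly on a trained network. Below is a minimal sketch, not the paper's released code, assuming PyTorch and a hypothetical SGD-trained `model` (all names are illustrative): excess kurtosis above zero indicates tails heavier than a Gaussian, and a nonzero correlation between neighbouring filter taps indicates spatial structure.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for an SGD-trained network; in practice,
# load the weights of an actually trained model.
model = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=3),
    nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=3),
)

def excess_kurtosis(w: torch.Tensor) -> float:
    """Excess kurtosis of the flattened weights: 0 for a Gaussian,
    > 0 for heavy-tailed distributions."""
    z = (w.flatten() - w.mean()) / w.std()
    return (z ** 4).mean().item() - 3.0

def neighbour_correlation(w: torch.Tensor) -> float:
    """Pearson correlation between horizontally adjacent taps of
    conv kernels with shape (out_ch, in_ch, kH, kW)."""
    pairs = torch.stack([w[..., :, :-1].flatten(), w[..., :, 1:].flatten()])
    return torch.corrcoef(pairs)[0, 1].item()

for name, p in model.named_parameters():
    if p.dim() == 4:  # convolutional kernels only
        print(f"{name}: kurtosis={excess_kurtosis(p.data):+.2f}, "
              f"corr={neighbour_correlation(p.data):+.2f}")
```

Likewise, "building these observations into the prior" can be sketched as swapping the isotropic Gaussian log-prior for a heavy-tailed Student-t one inside a tempered log-posterior, where T = 1 corresponds to exact Bayesian inference and T < 1 to a "cold" posterior. The degrees of freedom and scale below are placeholder values, not the paper's fitted hyperparameters:

```python
import torch
from torch.distributions import StudentT

def studentt_log_prior(params, df=3.0, scale=0.1):
    """i.i.d. heavy-tailed (Student-t) log-prior over all weights;
    df and scale are placeholder hyperparameters."""
    prior = StudentT(df, loc=0.0, scale=scale)
    return sum(prior.log_prob(p).sum() for p in params)

def tempered_log_posterior(log_likelihood, params, T=1.0):
    """Tempered log-posterior (log lik + log prior) / T:
    T = 1 is exact Bayesian inference, T < 1 a 'cold' posterior."""
    return (log_likelihood + studentt_log_prior(params)) / T
```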
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/arxiv:2102.06571/code)