Bayesian Neural Network Priors Revisited

Published: 21 Dec 2020, Last Modified: 12 Mar 2024 · AABI 2020
Keywords: Bayesian neural networks, priors, Gaussian
TL;DR: We show that heavy-tailed and correlated priors can improve the performance of Bayesian neural networks and reduce the cold posterior effect.
Abstract: Isotropic Gaussian priors are the de facto standard for modern Bayesian neural network inference. However, such simplistic priors are unlikely either to accurately reflect our true beliefs about the weight distributions or to give optimal performance. We study summary statistics of (convolutional) neural network weights in networks trained using SGD. We find that in certain circumstances, these networks have heavy-tailed weight distributions, while convolutional neural network weights often display strong spatial correlations. Building these observations into the respective priors, we obtain improved performance on MNIST classification. Remarkably, we find that using a more accurate prior partially mitigates the cold posterior effect: it improves performance at high temperatures, which correspond to exact Bayesian inference, while having less of an effect at low temperatures.
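The abstract describes three ingredients: diagnosing heavy tails in trained weights, replacing the isotropic Gaussian prior with a heavy-tailed prior (and a spatially correlated one for convolutional filters), and tempering the posterior. The NumPy/SciPy sketch below illustrates each ingredient; it is not the authors' code, and all numerical choices (degrees of freedom, scales, the covariance length scale, and the temperature) are illustrative assumptions rather than values from the paper.

```python
# Minimal sketch (illustrative only; not the authors' implementation).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# 1. Summary statistics: heavy tails show up as positive excess kurtosis
#    (a Gaussian has excess kurtosis 0).
w = 0.1 * rng.standard_t(df=5, size=50_000)   # stand-in for SGD-trained weights
print("excess kurtosis:", stats.kurtosis(w))  # clearly > 0 here

# 2. Prior log-densities: a heavy-tailed Student-t prior typically assigns
#    higher total log-density to such weights than an isotropic Gaussian.
print("Gaussian :", stats.norm(scale=0.1).logpdf(w).sum())
print("Student-t:", stats.t(df=5, scale=0.1).logpdf(w).sum())

# 3. Spatially correlated prior for a 3x3 convolutional filter: a zero-mean
#    multivariate Gaussian whose covariance decays with pixel distance.
coords = np.array([(i, j) for i in range(3) for j in range(3)], dtype=float)
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
cov = 0.1**2 * np.exp(-dist)                  # length scale of 1 pixel (assumed)
filt = rng.multivariate_normal(np.zeros(9), cov).reshape(3, 3)
print("correlated filter sample:\n", filt)

# 4. Tempered posterior: log p_T(w | D) = (log p(D | w) + log p(w)) / T + const.
#    T = 1 is exact Bayesian inference; T < 1 gives a "cold" posterior.
def tempered_log_posterior(log_lik, log_prior, T=1.0):
    return (log_lik + log_prior) / T
```

The kurtosis check in step 1 is the kind of summary statistic the abstract refers to, and step 4 makes explicit what "high" and "low" temperatures mean in the cold posterior effect.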
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/arxiv:2102.06571/code)