Entropy-MCMC: Sampling from Flat Basins with Ease

Published: 29 Nov 2023, Last Modified: 29 Nov 2023, NeurReps 2023 Poster
Submission Track: Extended Abstract
Keywords: Flatness-aware Learning, Bayesian Deep Learning, MCMC
TL;DR: We propose a practical MCMC algorithm to sample from the flat basins of deep neural network posteriors.
Abstract: Bayesian deep learning depends on the quality of posterior distribution estimation. However, the posterior of deep neural networks is highly multi-modal, with local modes that differ in generalization performance. Given a practical budget, sampling from the original posterior can lead to suboptimal performance, as some samples may become trapped in "bad" modes and overfit. Leveraging the observation that "good" modes with low generalization error often reside in flat basins of the energy landscape, we propose to bias posterior sampling toward these flat regions. Specifically, we introduce an auxiliary guiding variable, whose stationary distribution resembles a smoothed posterior free from sharp modes, to lead the MCMC sampler to flat basins. We prove the convergence of our method and further show that it converges faster than several existing flatness-aware methods in the strongly convex setting. Empirical results demonstrate that our method successfully samples from flat basins of the posterior and outperforms all compared baselines on multiple benchmarks, including classification, calibration, and out-of-distribution detection.
Submission Number: 44
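
The guiding-variable idea described in the abstract can be illustrated with a small toy sampler. The sketch below is not the authors' implementation, and the abstract does not specify how the guiding variable is coupled to the model parameters. It assumes a Gaussian coupling: a joint density over (theta, theta_a) proportional to exp(-U(theta) - (theta - theta_a)^2 / (2 * eta)). Under that assumed form, the marginal of theta is the original posterior, while the marginal of theta_a is the posterior convolved with a Gaussian of variance eta, which is one way to obtain a "smoothed posterior". The 1-D energy, the names `energy`, `grad_energy`, and `joint_langevin`, the parameter `eta`, and the use of plain unadjusted Langevin dynamics are all hypothetical choices made for illustration.

```python
import numpy as np

SIGMA_SHARP = 0.2  # width of the sharp mode (hypothetical toy value)
SIGMA_FLAT = 1.0   # width of the flat mode (hypothetical toy value)


def _mode_terms(theta):
    """Unnormalized densities of the sharp mode (at -2) and the flat mode (at +2)."""
    sharp = np.exp(-(theta + 2.0) ** 2 / (2 * SIGMA_SHARP ** 2))
    flat = np.exp(-(theta - 2.0) ** 2 / (2 * SIGMA_FLAT ** 2))
    return sharp, flat


def energy(theta):
    """Toy energy U(theta) = -log p(theta) for a two-mode, 1-D 'posterior'."""
    sharp, flat = _mode_terms(theta)
    return -np.log(sharp + flat)


def grad_energy(theta):
    """Analytic derivative of the toy energy with respect to theta."""
    sharp, flat = _mode_terms(theta)
    d_sharp = -(theta + 2.0) / SIGMA_SHARP ** 2 * sharp
    d_flat = -(theta - 2.0) / SIGMA_FLAT ** 2 * flat
    return -(d_sharp + d_flat) / (sharp + flat)


def joint_langevin(n_steps=5000, step=0.01, eta=0.5, seed=0, init=0.0):
    """Unadjusted Langevin dynamics on the ASSUMED joint energy
    U(theta) + (theta - theta_a)^2 / (2 * eta).

    Returns an array of shape (n_steps, 2) with columns (theta, theta_a).
    """
    rng = np.random.default_rng(seed)
    theta, theta_a = float(init), float(init)
    samples = np.empty((n_steps, 2))
    for t in range(n_steps):
        # Gradients of the joint energy, both evaluated at the current state.
        g_theta = grad_energy(theta) + (theta - theta_a) / eta
        g_aux = (theta_a - theta) / eta
        noise = rng.standard_normal(2)
        theta = theta - step * g_theta + np.sqrt(2 * step) * noise[0]
        theta_a = theta_a - step * g_aux + np.sqrt(2 * step) * noise[1]
        samples[t] = theta, theta_a
    return samples


if __name__ == "__main__":
    draws = joint_langevin()
    print("mean of theta  :", draws[:, 0].mean())
    print("mean of theta_a:", draws[:, 1].mean())
```

In this sketch the coupling term pulls theta toward theta_a, and theta_a in turn follows a smoother landscape than theta does. A larger `eta` loosens the coupling and widens the assumed smoothing kernel. Whether this matches the coupling, step sizes, or sampler used in the paper is not stated in the abstract.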