Bayesian Deep Learning via Stochastic Gradient MCMC with a Stochastic Approximation Adaptation

Wei Deng, Xiao Zhang, Faming Liang, Guang Lin

Sep 27, 2018 ICLR 2019 Conference Blind Submission
  • Abstract: We propose a robust Bayesian deep learning algorithm to infer complex posteriors with latent variables. Inspired by dropout, a popular tool for regularization and model ensembling, we assign sparse priors to the weights in deep neural networks (DNN) in order to achieve automatic “dropout” and avoid over-fitting. By alternately sampling from the posterior distribution through stochastic gradient Markov chain Monte Carlo (SG-MCMC) and optimizing the latent variables via stochastic approximation (SA), the trajectory of the target weights is proved to converge to the true posterior distribution conditioned on the optimal latent variables. This ensures stronger regularization on the over-fitted parameter space and more accurate uncertainty quantification on the decisive variables. Simulations on large-p-small-n regressions showcase the robustness of this method when applied to models with latent variables. Additionally, applying it to convolutional neural networks (CNN) leads to state-of-the-art performance on the MNIST and Fashion MNIST datasets and improved resistance to adversarial attacks.
  • Keywords: generalized stochastic approximation, stochastic gradient Markov chain Monte Carlo, adaptive algorithm, EM algorithm, convolutional neural networks, Bayesian inference, sparse prior, spike and slab prior, local trap
  • TL;DR: A robust Bayesian deep learning algorithm to infer complex posteriors with latent variables.
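
The alternating scheme described in the abstract (an SG-MCMC step on the weights, followed by a stochastic approximation step on the latent variables) can be illustrated with a minimal sketch on a toy large-p-small-n linear regression. This is not the authors' implementation: the sampler used here is plain stochastic gradient Langevin dynamics, the prior is a two-component Gaussian spike-and-slab mixture, and every name and hyperparameter (`make_toy_data`, `sgld_sa_sketch`, `v0`, `v1`, `pi`, `eps`, the step-size schedule) is an assumption chosen for illustration only.

```python
import numpy as np


def make_toy_data(n=50, p=100, n_signal=3, seed=0):
    """Toy large-p-small-n regression with a few nonzero coefficients (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    true_beta = np.zeros(p)
    true_beta[:n_signal] = 3.0
    y = X @ true_beta + rng.standard_normal(n)
    return X, y, true_beta


def sgld_sa_sketch(X, y, n_iter=2000, batch=25, eps=5e-4,
                   v0=0.01, v1=1.0, pi=0.1, sigma2=1.0, seed=0):
    """Alternate an SGLD step on the weights with an SA step on latent inclusion probabilities.

    Assumed prior: beta_j ~ (1 - gamma_j) N(0, v0) + gamma_j N(0, v1), with P(gamma_j = 1) = pi.
    rho_j tracks a running estimate of E[gamma_j | beta_j].
    """
    rng = np.random.default_rng(seed)
    N, p = X.shape
    beta = np.zeros(p)
    rho = np.full(p, 0.5)
    for k in range(1, n_iter + 1):
        # Sampling step: SGLD update of beta given the current latent estimate rho.
        idx = rng.choice(N, size=batch, replace=False)
        Xb, yb = X[idx], y[idx]
        grad_lik = (N / batch) * Xb.T @ (yb - Xb @ beta) / sigma2
        grad_prior = -beta * (rho / v1 + (1.0 - rho) / v0)
        noise = np.sqrt(2.0 * eps) * rng.standard_normal(p)
        beta = beta + eps * (grad_lik + grad_prior) + noise

        # Optimization step: move rho toward the conditional inclusion probability
        # with a decreasing stochastic approximation step size omega.
        logit = (np.log(pi / (1.0 - pi)) + 0.5 * np.log(v0 / v1)
                 + 0.5 * beta ** 2 * (1.0 / v0 - 1.0 / v1))
        rho_target = 1.0 / (1.0 + np.exp(-np.clip(logit, -50.0, 50.0)))
        omega = 1.0 / (k + 10) ** 0.75
        rho = (1.0 - omega) * rho + omega * rho_target
    return beta, rho


if __name__ == "__main__":
    X, y, true_beta = make_toy_data()
    beta, rho = sgld_sa_sketch(X, y)
    print("largest inclusion estimates at indices:", np.argsort(-rho)[:5])
```

In this sketch `rho` plays the role of the latent variables: coordinates with small `rho` are shrunk by the narrow spike component, which is one way to read the "automatic dropout" effect mentioned in the abstract. The final `beta` is a single draw from the chain, so posterior summaries would need averaging over many iterations.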