STOCHASTIC GRADIENT LANGEVIN DYNAMICS THAT EXPLOIT NEURAL NETWORK STRUCTURE

ICLR 2018 Workshop Submission, 12 Feb 2018 (modified: 04 Jun 2018)
Keywords: Monte Carlo, Bayesian deep networks
TL;DR: We use a recent approximation of the Fisher information (K-FAC) to improve approximate Bayesian inference for deep neural networks with Langevin dynamics.
Abstract: Tractable approximate Bayesian inference for deep neural networks remains challenging. Stochastic Gradient Langevin Dynamics (SGLD) offers a tractable approximation to the gold standard of Hamiltonian Monte Carlo. We improve on existing SGLD methods by incorporating a recently developed tractable approximation of the Fisher information, known as K-FAC, as a preconditioner.
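To make the abstract's idea concrete, here is a minimal NumPy sketch of one preconditioned SGLD update. This is not the authors' code: the function name and arguments are illustrative, the preconditioner is assumed to be a fixed positive-definite matrix M standing in for the K-FAC inverse-Fisher approximation (the Kronecker-factored construction itself is not reproduced), and the correction term required when the preconditioner depends on the parameters is omitted.

```python
import numpy as np

def preconditioned_sgld_step(theta, grad_log_post, precond, step_size, rng):
    """One preconditioned SGLD update (sketch).

    theta         : current parameter vector (1-D array)
    grad_log_post : stochastic estimate of the gradient of the log posterior
                    (minibatch log-likelihood gradient rescaled to the full
                    dataset, plus the log-prior gradient)
    precond       : fixed positive-definite preconditioner M; a stand-in for
                    a K-FAC-style inverse-Fisher approximation
    step_size     : Langevin step size epsilon
    rng           : numpy.random.Generator
    """
    # Drift term: (epsilon / 2) * M * grad log p(theta | data)
    drift = 0.5 * step_size * (precond @ grad_log_post)
    # Injected Gaussian noise with covariance epsilon * M, matching the
    # preconditioned Langevin diffusion.
    noise = rng.multivariate_normal(np.zeros_like(theta), step_size * precond)
    return theta + drift + noise

# Usage example on a toy standard-normal posterior; M = I recovers vanilla SGLD.
rng = np.random.default_rng(0)
theta = np.zeros(3)
M = np.eye(3)
grad = -theta  # gradient of log N(0, I) at theta
theta = preconditioned_sgld_step(theta, grad, M, step_size=1e-2, rng=rng)
```

With M set to the identity this reduces to the Welling and Teh (2011) SGLD update; the contribution described in the abstract is to replace M with a K-FAC approximation of the inverse Fisher information, which exploits the layerwise structure of the network.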