Markov Chain Score Ascent: A Unifying Framework of Variational Inference with Markovian Gradients

Kyurae Kim; Jisu Oh; Jacob R. Gardner; Adji Bousso Dieng; Hongseok Kim

Published: 31 Oct 2022 | Last Modified: 11 Jan 2023 | NeurIPS 2022 Accept | Readers: Everyone

Keywords: variational inference, Bayesian inference, inclusive Kullback-Leibler divergence, Markov chain gradient descent, Markov chain

TL;DR: We provide a unifying non-asymptotic analysis of recent variational inference methods based on Markovian gradients and propose an improved scheme.

Abstract: Minimizing the inclusive Kullback-Leibler (KL) divergence with stochastic gradient descent (SGD) is challenging since its gradient is defined as an integral over the posterior. Recently, multiple methods have been proposed to run SGD with biased gradient estimates obtained from a Markov chain. This paper provides the first non-asymptotic convergence analysis of these methods by establishing their mixing rate and gradient variance. To do this, we demonstrate that these methods—which we collectively refer to as Markov chain score ascent (MCSA) methods—can be cast as special cases of the Markov chain gradient descent framework. Furthermore, by leveraging this new understanding, we develop a novel MCSA scheme, parallel MCSA (pMCSA), that achieves a tighter bound on the gradient variance. We demonstrate that this improved theoretical result translates to superior empirical performance.
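
To make the idea in the abstract concrete, here is a minimal sketch of score ascent driven by a Markov chain. It is not the paper's algorithm or any of the specific MCSA schemes; it only illustrates the general pattern. The gradient of the inclusive KL divergence with respect to the variational parameters is the negative posterior expectation of the score of q, so each step advances a Markov chain that targets the posterior and then moves the parameters along the score at the chain's current state. Everything specific here is an assumption chosen for illustration: the one-dimensional Gaussian toy target, the Gaussian variational family, the independent Metropolis-Hastings kernel that proposes from q, the decaying step size, and all function names.

```python
import numpy as np


def log_target(z):
    # Unnormalized log-density of a toy "posterior": N(2, 0.5^2) (assumed).
    return -0.5 * ((z - 2.0) / 0.5) ** 2


def log_q(z, mu, log_sigma):
    # Log-density of q = N(mu, sigma^2), up to an additive constant.
    sigma = np.exp(log_sigma)
    return -0.5 * ((z - mu) / sigma) ** 2 - log_sigma


def score(z, mu, log_sigma):
    # Gradient of log q(z) with respect to (mu, log_sigma).
    sigma2 = np.exp(2.0 * log_sigma)
    return np.array([(z - mu) / sigma2, (z - mu) ** 2 / sigma2 - 1.0])


def imh_step(z, mu, log_sigma, rng):
    # One independent Metropolis-Hastings step that proposes from q and
    # leaves the target invariant (hypothetical choice of kernel).
    proposal = mu + np.exp(log_sigma) * rng.standard_normal()
    log_w_new = log_target(proposal) - log_q(proposal, mu, log_sigma)
    log_w_old = log_target(z) - log_q(z, mu, log_sigma)
    if np.log(rng.uniform()) < log_w_new - log_w_old:
        return proposal
    return z


def markov_chain_score_ascent(n_steps=5000, seed=0):
    rng = np.random.default_rng(seed)
    lam = np.array([0.0, 0.0])  # (mu, log_sigma), arbitrary starting point
    z = 0.0                     # initial state of the Markov chain
    for t in range(1, n_steps + 1):
        # Advance the chain, then ascend the score at its current state.
        z = imh_step(z, lam[0], lam[1], rng)
        step_size = 0.1 / np.sqrt(t)  # assumed decaying schedule
        lam = lam + step_size * score(z, lam[0], lam[1])
    return float(lam[0]), float(np.exp(lam[1]))
```

Calling `markov_chain_score_ascent()` returns the fitted mean and standard deviation, which should drift toward the toy target's values of 2 and 0.5. Because consecutive chain states are correlated and the kernel changes with the parameters, the gradient estimates are biased at every step, which is the setting the abstract's convergence analysis addresses.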

Supplementary Material: pdf
