- Keywords: Variational Inference, Approximate Inference, MORE, VOGN, VIPS, GVA
- Abstract: Variational inference with full-covariance Gaussian approximations is an important line of research, as such Gaussian variational approximations (GVAs) allow for tractable approximate inference while yielding superior approximations compared to mean-field methods. Moreover, it was recently shown that the problem of variational inference with Gaussian mixture models can be reduced to Gaussian variational inference using VIPS, a procedure similar to expectation maximization. Effective approaches for Gaussian variational inference are MORE, VOGN, and VON, which are zeroth-order, first-order, and second-order methods, respectively. We focus on the first-order setting, which is arguably the most relevant for variational inference, and show that the biases introduced by the generalized Gauss-Newton approximation, which is applied by VOGN, can seriously affect the quality of the learned approximation. Hence, we propose gMORE, a method that is similar to MORE but differs by incorporating gradient information. gMORE achieves unbiased, high-quality approximations of the Hessian that are comparable to those of VON, which has direct access to the Hessian. Our algorithm converges even in settings where VOGN does not. Compared to MORE, the additional gradient information improves sample efficiency by about an order of magnitude. Furthermore, we evaluate the different approaches in the GMM setting by modifying VIPS, which had previously only been tested in combination with MORE, and show that the results from the GVA setting transfer to GMMs, setting a new standard for GMM-based variational inference.
- One-sentence Summary: We propose a novel method for learning Gaussian variational approximations and evaluate it for variational inference with Gaussian mixture models.
- Supplementary Material: zip