Transport Score Climbing: Variational Inference Using Forward KL and Adaptive Neural Transport

Liyi Zhang; David Blei; Christian A Naesseth

Published: 08 Aug 2023. Last Modified: 17 Sept 2024. Accepted by TMLR.

Abstract: Variational inference often minimizes the ``reverse'' Kullback-Leibler (KL) divergence $D_{KL}(q||p)$ from the approximate distribution $q$ to the posterior $p$. Recent work studies the ``forward'' KL divergence $D_{KL}(p||q)$, which, unlike the reverse KL, does not lead to variational approximations that underestimate uncertainty. Computing the forward KL requires an expectation under the posterior, which prior work has evaluated with Markov chain Monte Carlo (MCMC) methods. This paper introduces Transport Score Climbing (TSC), a method that optimizes $D_{KL}(p||q)$ using Hamiltonian Monte Carlo (HMC), but runs the HMC chain on a transformed, or warped, space. A function called the transport map performs the transformation by acting as a change of variables from the latent variable space. TSC uses HMC samples to dynamically train the transport map while optimizing $D_{KL}(p||q)$. TSC exploits a synergy: better transport maps lead to better HMC sampling, which in turn leads to better transport maps. We demonstrate TSC on synthetic and real data, including using TSC to train variational auto-encoders. We find that TSC achieves competitive performance in these experiments.
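To make the loop described in the abstract concrete, here is a minimal one-dimensional sketch. It is not the authors' implementation (see the linked code for that). It assumes a toy unnormalized Gaussian target, an affine transport map $T(\epsilon) = \mu + \sigma\epsilon$ standing in for the neural transport map, and a constant step size. All function and variable names are hypothetical. The update uses the identity $\nabla_\lambda D_{KL}(p||q_\lambda) = -E_p[\nabla_\lambda \log q_\lambda(z)]$, with the expectation approximated by one HMC sample per iteration. Carrying the chain state across parameter updates by re-mapping $z$ through the current inverse map is also an assumption of this sketch.

```python
import numpy as np

def log_p(z):
    # Toy unnormalized target N(3, 2^2); an assumption for illustration only.
    return -0.5 * ((z - 3.0) / 2.0) ** 2

def grad_log_p(z):
    return -(z - 3.0) / 4.0

def hmc_step(eps, mu, sigma, rng, step=0.25, n_leapfrog=5):
    """One HMC transition on the warped variable eps, where z = mu + sigma * eps."""
    def log_w(e):
        # Target pulled back through the map, including the log-Jacobian term.
        return log_p(mu + sigma * e) + np.log(sigma)

    def grad_log_w(e):
        return sigma * grad_log_p(mu + sigma * e)

    r0 = rng.standard_normal()
    e, r = eps, r0
    # Leapfrog integrator: half momentum step, full steps, half momentum step.
    r = r + 0.5 * step * grad_log_w(e)
    for i in range(n_leapfrog):
        e = e + step * r
        if i < n_leapfrog - 1:
            r = r + step * grad_log_w(e)
    r = r + 0.5 * step * grad_log_w(e)
    # Metropolis correction on the joint (position, momentum) log density.
    log_accept = (log_w(e) - 0.5 * r ** 2) - (log_w(eps) - 0.5 * r0 ** 2)
    return e if np.log(rng.uniform()) < log_accept else eps

def transport_score_climbing(n_iters=5000, lr=0.01, seed=0):
    rng = np.random.default_rng(seed)
    mu, log_sigma = 0.0, 0.0   # parameters of q, i.e. of the affine map
    z = 0.0                    # current chain state in the latent space
    for _ in range(n_iters):
        sigma = np.exp(log_sigma)
        eps = (z - mu) / sigma               # map state into the warped space
        eps = hmc_step(eps, mu, sigma, rng)  # sample in the warped space
        z = mu + sigma * eps                 # map the sample back
        # Ascend log q(z; mu, sigma), which descends the forward KL on average.
        mu += lr * eps / sigma               # d log q / d mu
        log_sigma += lr * (eps ** 2 - 1.0)   # d log q / d log_sigma
    return mu, np.exp(log_sigma)

if __name__ == "__main__":
    mu, sigma = transport_score_climbing()
    print(f"fitted mean {mu:.2f}, fitted std {sigma:.2f} (target: 3, 2)")
```

As the map improves, the pulled-back target in the warped space approaches a standard normal, which is the regime where a fixed HMC step size works well. A decaying step size would replace the constant `lr` if convergence rather than fluctuation around the optimum is needed.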

Submission Length: Regular submission (no more than 12 pages of main content)

Previous TMLR Submission Url: https://openreview.net/forum?id=zfBW39xZ2E

Changes Since Last Submission:
* Removed the convergence proof, which is mainly an application of Ou & Song (2020), and revised the motivation to emphasize that the main contributions are methodology development and empirical evaluation. The motivation is revised in Section 1 (Introduction) and Section 4 (Empirical Evaluation).
* Extended the empirical results with evaluations on VAE metrics, and added a benchmark in Section 4.1.
* Various edits to the text.

References: Zhijian Ou and Yunfu Song. Joint stochastic approximation and its application to learning discrete latent variable models. In Conference on Uncertainty in Artificial Intelligence, 2020.

Code: https://github.com/zhang-liyi/tsc

Supplementary Material: zip

Assigned Action Editor: ~Michal_Valko1

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Submission Number: 1118
