- Keywords: Latent variable models, upper bound on marginal likelihood, chi-square divergence
- TL;DR: An empirical study of variational inference based on chi-square divergence minimization, showing that minimizing the CUBO (the chi-square upper bound objective) is trickier than maximizing the ELBO
- Abstract: Variational inference based on chi-square divergence minimization (CHIVI) provides a way to approximate a model's posterior while obtaining an upper bound on the marginal likelihood. In practice, however, CHIVI relies on Monte Carlo (MC) estimates of an upper bound objective that, at modest sample sizes, are not guaranteed to be true bounds on the marginal likelihood. This paper provides an empirical study of CHIVI performance on a series of synthetic inference tasks. We show that CHIVI is far more sensitive to initialization than classic VI based on KL minimization, often needs a very large number of samples (over a million), and may fail to deliver a reliable upper bound. We also suggest possible ways to detect and alleviate some of these pathologies, including diagnostic bounds and initialization strategies.
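The MC estimation issue the abstract describes can be made concrete with a small sketch. The CUBO-style objective is \( \mathrm{CUBO}_n = \tfrac{1}{n}\log \mathbb{E}_{q}\!\left[(p(x,z)/q(z))^n\right] \); the standard MC estimator replaces the expectation with a sample average, and by Jensen's inequality the resulting log-of-average estimate is biased downward, so at small sample sizes it can fall below \( \log p(x) \) and fail to be a true upper bound. The code below is an illustrative sketch (function and variable names are our own, not from the paper):

```python
import numpy as np

def cubo_mc_estimate(log_p, log_q, samples, n=2):
    """MC estimate of CUBO_n = (1/n) log E_q[(p(x,z)/q(z))^n].

    `samples` are draws from q. The estimator averages exp(n * log-weights)
    and takes a log, so by Jensen's inequality it is biased *downward*:
    at modest sample sizes it may underestimate the true bound.
    """
    # log importance weights log p(x,z) - log q(z) for each sample
    log_w = np.array([log_p(z) - log_q(z) for z in samples])
    # stable log-mean-exp of n * log_w
    m = np.max(n * log_w)
    return (m + np.log(np.mean(np.exp(n * log_w - m)))) / n

# Toy check: if log p and log q differ by a constant c (i.e. q is exact
# up to normalization), every weight equals c and the estimate is exactly c.
rng = np.random.default_rng(0)
z_samples = rng.normal(size=100)
log_q = lambda z: -0.5 * z**2
log_p = lambda z: -0.5 * z**2 + 1.3
print(cubo_mc_estimate(log_p, log_q, z_samples))  # → 1.3
```

With a poorly matched q (e.g. an underdispersed Gaussian), rerunning this estimator across seeds shows high variance and frequent underestimates, which is the finite-sample failure mode the abstract highlights.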