- Keywords: variational inference, Kullback-Leibler, divergence, importance sampling, boosting, adaptive importance sampling, variational boosting
- TL;DR: We replace the reverse KL (RKL) with the forward KL (FKL) in variational inference/boosting to alleviate the light-tail and mode-seeking weaknesses of RKL and to theoretically guarantee convergence to the optimal importance sampling (IS) proposal distribution.
- Abstract: Variational Inference (VI) is a popular alternative to asymptotically exact sampling in Bayesian inference. Its main workhorse is the minimization of a reverse Kullback-Leibler divergence (RKL), which typically underestimates the tails of the posterior and causes miscalibration and potential degeneracy (over-pruning). Importance sampling (IS), on the other hand, is often used to fine-tune and debias the estimates of approximate Bayesian inference procedures. The quality of IS crucially depends on the choice of the proposal distribution. Ideally, the proposal distribution has heavier tails than the target, which is unachievable by minimizing the RKL. We thus propose a novel combination of optimization and sampling techniques for approximate Bayesian inference by constructing an IS proposal distribution through the minimization of a forward KL (FKL) divergence. This approach guarantees asymptotic consistency and fast convergence towards both the optimal IS estimator and the optimal variational approximation.
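
To make the FKL idea concrete, here is a minimal sketch and not the method of the paper. It assumes a Gaussian proposal family, for which minimizing the forward KL, KL(p‖q), reduces to matching the mean and variance of the target p. Those moments are estimated by self-normalized importance sampling from the current proposal, and the proposal is then refit, which gives a simple adaptive IS loop. The bimodal toy target, the function names (`log_target`, `log_gauss`, `fit_fkl_gaussian`), and all parameter values are illustrative assumptions.

```python
import math
import random


def log_target(x):
    """Unnormalized log-density of a toy bimodal target (an assumption):
    exp(-(x - 2)^2 / 2) + exp(-(x + 2)^2 / 2), computed via log-sum-exp."""
    a = -0.5 * (x - 2.0) ** 2
    b = -0.5 * (x + 2.0) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def log_gauss(x, mu, sigma):
    """Log-density of the Gaussian proposal N(mu, sigma^2)."""
    return (-0.5 * ((x - mu) / sigma) ** 2
            - math.log(sigma) - 0.5 * math.log(2.0 * math.pi))


def fit_fkl_gaussian(n_samples=20000, n_iters=10, mu=2.0, sigma=2.0, seed=0):
    """Adaptive IS with FKL (moment-matching) updates of a Gaussian proposal.

    Returns the fitted mean, the fitted standard deviation, and the IS
    estimate of the target's normalizing constant from the last iteration.
    """
    rng = random.Random(seed)
    z_hat = float("nan")
    for _ in range(n_iters):
        # Draw from the current proposal and compute importance weights.
        xs = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        ws = [math.exp(log_target(x) - log_gauss(x, mu, sigma)) for x in xs]
        total = sum(ws)
        # The average weight is an unbiased estimate of the normalizer.
        z_hat = total / n_samples
        # Self-normalized estimates of the target's mean and variance.
        new_mu = sum(w * x for w, x in zip(ws, xs)) / total
        new_var = sum(w * (x - new_mu) ** 2 for w, x in zip(ws, xs)) / total
        # FKL-optimal Gaussian: match the estimated moments.
        mu, sigma = new_mu, math.sqrt(new_var)
    return mu, sigma, z_hat


if __name__ == "__main__":
    mu, sigma, z_hat = fit_fkl_gaussian()
    print(f"mean={mu:.3f}  std={sigma:.3f}  Z_hat={z_hat:.3f}")
```

For this toy target the exact moments are mean 0 and variance 5, and the normalizing constant is 2√(2π), so the fitted proposal should cover both modes rather than collapse onto one of them. A single-mode, mode-seeking fit would have variance close to 1 and would leave the other mode badly covered, which is the failure of RKL-based proposals that the abstract describes.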