Keywords: MRF, latent variable, Bethe, UGM, approximate inference, deep generative model
TL;DR: Learning deep latent variable MRFs with a saddle-point objective derived from the Bethe partition function approximation.
Abstract: While much recent work has targeted learning deep discrete latent variable models with variational inference, this setting remains challenging, and it is often necessary to make use of potentially high-variance gradient estimators in optimizing the ELBO. As an alternative, we propose to optimize a non-ELBO objective derived from the Bethe free energy approximation to an MRF's partition function. This objective gives rise to a saddle-point learning problem, which we train inference networks to approximately optimize. The derived objective requires no sampling, and can be efficiently computed for many MRFs of interest. We evaluate the proposed approach in learning high-order neural HMMs on text, and find that it often outperforms other approximate inference schemes in terms of true held-out log likelihood. At the same time, we find that all the approximate inference-based approaches to learning high-order neural HMMs we consider underperform learning with exact inference by a significant margin.