Keywords: density estimation, energy-based model, contrastive estimation, bijections, parameter estimation
Abstract: In this work, we propose Bijective-Contrastive Estimation (BCE), a classification-based learning criterion for energy-based models. We generate a collection of contrasting distributions using bijections and solve the classification problems between the original data distribution and each distribution induced by a bijection, using a classifier parameterized by an energy model. We show that if the classification objective is minimized, the energy function uniquely recovers the data density up to a normalizing constant. Unlike noise contrastive estimation, this approach does not require explicitly specifying a contrasting distribution. Experimentally, we demonstrate that the proposed method works well on 2D synthetic datasets. We discuss the difficulty in high-dimensional cases and propose potential directions for future work.
TL;DR: We use bijections to generate contrasting distributions to learn an energy function of the data distribution.