TL;DR: We provide a new, unified view of learning unnormalized distributions via a family of NCE principles, and establish finite-sample rates for a wide class of estimators for exponential family distributions.
Abstract: This paper studies a family of estimators based on noise-contrastive estimation (NCE) for learning unnormalized distributions. The main contribution of this work is to provide a unified perspective on various methods for learning unnormalized distributions, which have been independently proposed and studied in separate research communities, through the lens of NCE. This unified view offers new insights into existing estimators. Specifically, for exponential families, we establish the finite-sample convergence rates of the proposed estimators under a set of regularity assumptions, most of which are new.
Lay Summary: Some of the most flexible and expressive probabilistic models in machine learning, such as those used to model natural images or simulate physical systems, are unnormalized probability distributions, commonly known as energy-based models. These models assign a score, or "energy", to each possible outcome; lower energy corresponds to higher probability. This makes them powerful tools for capturing complex patterns in data. However, there is a trade-off: these models do not provide exact probabilities because they avoid computing a costly "normalizing constant". In other words, this flexibility is a double-edged sword, as it enables rich modeling but makes learning from data more difficult. Across statistics, computational physics, and machine learning, various methods have been proposed to learn with such unnormalized models. In this work, we show that many of these seemingly different approaches can be understood through a common principle called noise-contrastive estimation (NCE), which learns by contrasting real data with random noise. Our framework connects ideas across disciplines, clarifies how these methods work, and offers a unified and systematic view of how quickly they can learn from data. We believe these insights bring us a step closer to making energy-based models more practical and widely usable.
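The contrastive principle described above can be illustrated with a minimal toy sketch. The setup below is purely illustrative and not one of the paper's estimators: a one-dimensional unnormalized Gaussian model with an assumed location parameter `theta` and a log-normalizing constant `c` learned as a free parameter (the standard NCE device), contrasted against a known Gaussian noise distribution via logistic regression on the log-density ratio.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy illustration (all modeling choices are assumptions, not from the paper):
# data from N(2, 1); unnormalized model exp(-(u - theta)^2 / 2 - c), where the
# log-normalizer c is treated as a free parameter; noise is a known N(0, 2^2).
x = rng.normal(2.0, 1.0, size=5000)   # real data samples
y = rng.normal(0.0, 2.0, size=5000)   # noise samples

def log_noise(u):
    return -0.5 * (u / 2.0) ** 2 - np.log(2.0 * np.sqrt(2.0 * np.pi))

def log_model(u, theta, c):
    return -0.5 * (u - theta) ** 2 - c

def nce_gradients(theta, c):
    # NCE = logistic regression on the log-ratio G(u) = log_model - log_noise:
    # minimize -E_data[log s(G)] - E_noise[log s(-G)], with s the sigmoid.
    gx = log_model(x, theta, c) - log_noise(x)
    gy = log_model(y, theta, c) - log_noise(y)
    sx = 1.0 / (1.0 + np.exp(gx))     # sigmoid(-G) on data points
    sy = 1.0 / (1.0 + np.exp(-gy))    # sigmoid(G) on noise points
    d_theta = -(sx * (x - theta)).mean() + (sy * (y - theta)).mean()
    d_c = sx.mean() - sy.mean()
    return d_theta, d_c

# Plain gradient descent on the NCE objective.
theta, c = 0.0, 0.0
for _ in range(5000):
    d_theta, d_c = nce_gradients(theta, c)
    theta -= 0.1 * d_theta
    c -= 0.1 * d_c
```

After fitting, `theta` should lie near the true location 2, and `c` near the true log-normalizer 0.5·log(2π) ≈ 0.919, showing how NCE recovers the normalizing constant without ever computing an integral.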
Primary Area: Theory->Learning Theory
Keywords: unnormalized models, exponential family distributions, noise-contrastive estimation, interaction screening, finite-sample analysis
Submission Number: 1927