Dual Training of Energy-Based Models with Overparametrized Shallow Neural Networks

ICLR 2022 Submission. Published: 28 Jan 2022; Last Modified: 13 Feb 2023
Keywords: energy-based models, generative modeling, neural networks, duality, Fenchel, maximum mean discrepancy, maximum likelihood, active, lazy, score matching, measure, feature
Abstract: Energy-based models (EBMs) are generative models that are usually trained via maximum likelihood estimation. This approach becomes challenging in generic situations where the trained energy is nonconvex, due to the need to sample the Gibbs distribution associated with this energy. Using general Fenchel duality results, we derive variational principles dual to maximum likelihood EBMs with shallow overparametrized neural network energies, both in the active (aka feature-learning) and lazy regimes. In the active regime, this dual formulation leads to a training algorithm in which one concurrently updates the particles in the sample space and, at a faster rate, the neurons in the parameter space of the energy. We also consider a variant of this algorithm in which the particles are sometimes restarted at random samples drawn from the data set, and show that performing these restarts at every iteration step corresponds to score matching training. Intermediate restart frequencies in our dual algorithm thereby give a way to interpolate between maximum likelihood and score matching training. These results are illustrated in simple numerical experiments.
One-sentence Summary: We derive dual algorithms to train maximum likelihood EBMs with shallow overparametrized neural network energies, incorporating score matching in our framework.
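
Code Sketch: To make the dual training loop concrete, below is a minimal NumPy sketch of the algorithm as described in the abstract. This is not the authors' implementation: the ReLU energy E(x) = (1/m) sum_i c_i relu(w_i.x + b_i), the Langevin-style particle updates, and every name and hyperparameter (grad_x, grad_theta, lr_x, lr_theta, restart_prob, the synthetic data) are assumptions made for illustration. Setting restart_prob = 1 restarts the particles at data samples at every step, the regime the abstract identifies with score matching; intermediate values interpolate toward maximum likelihood.

import numpy as np

rng = np.random.default_rng(0)
m, d = 256, 2                        # number of neurons, data dimension (assumed)
W = rng.normal(size=(m, d))          # neuron weights
b = rng.normal(size=m)               # neuron biases
c = rng.normal(size=m)               # output weights

def energy(X):
    # Shallow overparametrized energy E(x) = (1/m) sum_i c_i relu(w_i.x + b_i)
    return np.maximum(X @ W.T + b, 0.0) @ c / m

def grad_x(X):
    # Gradient of the energy with respect to the particle positions X, shape (n, d)
    ind = (X @ W.T + b > 0).astype(float)        # ReLU derivative, shape (n, m)
    return (ind * c) @ W / m

def grad_theta(X):
    # Gradients of the batch-averaged energy with respect to (c, W, b)
    pre = X @ W.T + b
    ind = (pre > 0).astype(float)
    gc = np.maximum(pre, 0.0).mean(axis=0) / m
    gW = (ind * c).T @ X / (m * len(X))
    gb = (ind * c).mean(axis=0) / m
    return gc, gW, gb

def dual_train(data, n_steps=2000, n_particles=512,
               lr_x=1e-2, lr_theta=1e-1, temp=1.0, restart_prob=0.05):
    global W, b, c
    X = data[rng.integers(len(data), size=n_particles)].copy()
    for _ in range(n_steps):
        # Sometimes restart particles at random data samples; restart_prob = 1
        # (restarts at every iteration) corresponds to score matching training.
        mask = rng.random(n_particles) < restart_prob
        X[mask] = data[rng.integers(len(data), size=int(mask.sum()))]
        # Particle step: noisy gradient descent on the energy (a Langevin step
        # targeting the Gibbs distribution proportional to exp(-E(x)/temp)).
        X = X - lr_x * grad_x(X) + np.sqrt(2.0 * lr_x * temp) * rng.normal(size=X.shape)
        # Neuron step, at a faster rate (lr_theta > lr_x): descend the maximum
        # likelihood loss E_data[E] - E_particles[E].
        gc_d, gW_d, gb_d = grad_theta(data)
        gc_p, gW_p, gb_p = grad_theta(X)
        c -= lr_theta * (gc_d - gc_p)
        W -= lr_theta * (gW_d - gW_p)
        b -= lr_theta * (gb_d - gb_p)
    return X

# Usage on synthetic data: a two-Gaussian mixture in 2D.
data = np.concatenate([rng.normal(-2.0, 0.3, size=(500, d)),
                       rng.normal(+2.0, 0.3, size=(500, d))])
samples = dual_train(data)
print("mean energy of data     :", energy(data).mean())
print("mean energy of particles:", energy(samples).mean())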