Keywords: Energy-Based Models, Diffusion Models, Few-Step Generation, Likelihood-based Training, Adversarial Training
TL;DR: A significant improvement in few-step generation for Energy-Based Models.
Abstract: Energy-Based Models (EBMs) offer a principled framework for modeling complex data distributions. However, their training via contrastive divergence is often hindered by slow and unstable MCMC sampling, especially in high-dimensional settings. Recent advances such as diffusion recovery likelihood and cooperative diffusion recovery likelihood (CDRL) improve tractability by training conditional EBMs across multiple noise scales, but they still rely on Langevin sampling at each scale, which results in slow inference. In this work, we propose VDRME (Variational Diffusion Recovery with Multiscale Energy), a novel framework that amortizes MCMC sampling at each scale with conditional generators. We train the conditional EBMs using a variational lower bound on the maximum likelihood objective, enabling efficient one-step sampling per scale. To further enhance diversity and prevent mode collapse, we introduce entropy-based regularization of the generators. Unlike diffusion GANs, which rely on adversarial losses and classifier guidance, VDRME maintains a fully energy-based formulation and produces informative energy priors for downstream tasks. Our experiments demonstrate that VDRME achieves fast, few-step generation with high perceptual quality, improved convergence, and strong performance on downstream tasks such as out-of-distribution detection and density estimation. By replacing score-based training with multiscale generators and avoiding traditional MCMC, VDRME offers a scalable, interpretable, and efficient alternative to existing EBMs and diffusion-based generative models.
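For illustration only, the sketch below shows one way the amortized, few-step sampling described in the abstract could be organized: one conditional generator per noise scale, each replacing the Langevin chain of the corresponding conditional EBM with a single forward pass. All names (Generator, few_step_sample, sigma_max) and the toy architecture are assumptions made for this sketch, not details taken from the paper.

```python
# Illustrative sketch only; names and architecture are assumptions, not from the paper.
import torch
import torch.nn as nn


class Generator(nn.Module):
    """Conditional generator for one noise scale: maps a noisier sample
    (plus fresh latent noise) to a cleaner sample in one forward pass."""

    def __init__(self, dim, latent_dim=64):
        super().__init__()
        self.latent_dim = latent_dim
        self.net = nn.Sequential(
            nn.Linear(dim + latent_dim, 256),
            nn.SiLU(),
            nn.Linear(256, dim),
        )

    def forward(self, x_noisy):
        # Fresh latent noise keeps the mapping stochastic, which the
        # entropy-based regularization mentioned in the abstract would act on.
        z = torch.randn(x_noisy.size(0), self.latent_dim, device=x_noisy.device)
        return self.net(torch.cat([x_noisy, z], dim=-1))


@torch.no_grad()
def few_step_sample(generators, sigma_max, shape, device="cpu"):
    """Few-step generation: one generator call per noise scale,
    traversing scales from the highest noise level down to clean data."""
    x = sigma_max * torch.randn(shape, device=device)  # start from pure noise
    for g in reversed(generators):                      # coarse-to-fine scales
        x = g(x)                                        # amortized one-step sampling
    return x


# Example usage with toy dimensions: 5 noise scales, 2-D data, 16 samples.
gens = [Generator(dim=2) for _ in range(5)]
samples = few_step_sample(gens, sigma_max=1.0, shape=(16, 2))
```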
Primary Area: generative models
Submission Number: 22682