No MCMC Teaching For me: Learning Energy-Based Models via Diffusion Synergy

Shanchao Yang; WU Yanrui; Yidong Ouyang; Baoxiang Wang; Hongyuan Zha

26 Sept 2024 (modified: 05 Feb 2025), Submitted to ICLR 2025, CC BY 4.0

Keywords: energy-based models, generative modeling, sampling, diffusion models

TL;DR: We propose an innovative MCMC teaching-free framework that jointly trains Energy-Based Models and diffusion-based generative models, significantly enhancing training efficiency and accuracy by eliminating the reliance on biased MCMC samples.

Abstract: Markov chain Monte Carlo (MCMC) sampling-based maximum likelihood estimation is a standard approach for training Energy-Based Models (EBMs). However, its effectiveness and training stability in high-dimensional settings remain thorny issues due to challenges like mode collapse and slow mixing of MCMC. To address these limitations, we introduce a novel MCMC teaching-free learning framework that jointly trains an EBM and a diffusion-based generative model, leveraging the variational formulation of divergence between time-reversed diffusion paths. In each iteration, the generator model is trained to align with both the empirical data distribution and the current EBM, bypassing the need for biased MCMC sampling. The EBM is then updated by maximizing the likelihood of the synthesized examples generated through a diffusion generative process that more accurately reflects the EBM’s distribution. Moreover, we propose a novel objective function that further improves EBM learning by minimizing the discrepancy between the EBM and the generative model. Our proposed approach enhances training efficiency and overcomes key challenges associated with traditional MCMC-based methods. Experimental results on generative modeling and likelihood estimation demonstrate the superior performance of our method.
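To make the EBM update step in the abstract concrete, here is a minimal sketch of the standard maximum-likelihood gradient for an energy-based model, in which the model-side expectation is estimated from generated samples rather than from an MCMC chain. It is not the authors' implementation. The one-dimensional quadratic energy, the names `energy_grad_theta`, `sample_generator`, `ebm_update`, and `train`, and all hyperparameters are assumptions made for illustration. The diffusion-based generator is replaced by an exact Gaussian sampler as a stand-in, and the generator training and the additional discrepancy objective described in the abstract are omitted.

```python
import random

def energy_grad_theta(x, theta):
    # Toy energy (assumption): E_theta(x) = 0.5 * (x - theta)^2,
    # so dE/dtheta = -(x - theta).
    return -(x - theta)

def sample_generator(theta, n, rng):
    # Stand-in for the learned diffusion generator (assumption):
    # for this toy energy, p_theta is exactly N(theta, 1).
    return [rng.gauss(theta, 1.0) for _ in range(n)]

def ebm_update(theta, data_batch, model_batch, lr):
    # Gradient of the average log-likelihood with respect to theta:
    #   E_model[dE/dtheta] - E_data[dE/dtheta]
    g_data = sum(energy_grad_theta(x, theta) for x in data_batch) / len(data_batch)
    g_model = sum(energy_grad_theta(x, theta) for x in model_batch) / len(model_batch)
    return theta + lr * (g_model - g_data)

def train(seed=0, steps=200, batch=256, lr=0.1):
    rng = random.Random(seed)
    # Synthetic "data" drawn from N(2, 1); the fitted theta should approach 2.
    data = [rng.gauss(2.0, 1.0) for _ in range(1000)]
    theta = 0.0
    for _ in range(steps):
        data_batch = rng.sample(data, batch)
        model_batch = sample_generator(theta, batch, rng)  # no MCMC chain here
        theta = ebm_update(theta, data_batch, model_batch, lr)
    return theta
```

Calling `train()` moves `theta` toward the mean of the synthetic data. In this toy setting the stand-in sampler is exact, so the estimate of the model expectation is unbiased; the abstract's point is that a well-trained generator can play this role where short-run MCMC samples would be biased.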

Primary Area: generative models

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 6908
