Why Diffusion Models Are Stable and How to Make Them Faster: An Empirical Investigation and Optimization

22 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: diffusion model, curriculum learning, momentum decay, loss landscape
Abstract: Diffusion models, a powerful generative framework, have garnered considerable attention in recent years. Many attribute their superiority over Generative Adversarial Networks (GANs) to a more stable training process, but these claims typically rest on intuition rather than empirical evidence. In this paper, we provide direct evidence for why diffusion models exhibit remarkable stability during training. We first conduct a consistency experiment, comparing the generation results of models trained with different hyperparameters, such as initialization and model structure, under identical sampling conditions. Our results show that diffusion models produce consistent generations across these hyperparameters, indicating that they are stable in learning the mapping between noise and data. We then compare the loss landscapes of diffusion models and GANs, and find that diffusion models have much smoother loss landscapes, implying better convergence stability. Building on these analyses, we propose two optimization methods for diffusion models: the curriculum-learning-based timestep schedule (CLTS), which optimizes the sampling probability of timesteps, and momentum decay with learning-rate compensation (MDLRC), which optimizes the momentum schedule. Both are designed to accelerate convergence; for example, on ImageNet128 our methods achieve a 2.6x training speedup, demonstrating their effectiveness.
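The abstract only sketches the two proposed optimizations at a high level. The snippet below is a minimal, hypothetical reading of them for a DDPM-style trainer: a curriculum-style timestep sampler whose sampling range and probabilities change with training progress, and a momentum schedule that decays beta1 while rescaling the learning rate so the effective update size stays roughly constant. Every function name, schedule shape, and constant here is an illustrative assumption, not the paper's actual CLTS/MDLRC definition.

import numpy as np

# --- Hypothetical curriculum-style timestep schedule (CLTS-like sketch) ---
# Assumption: early in training, concentrate sampling probability on a subset
# of timesteps and widen the range as training progresses.
def sample_timesteps(batch_size, num_timesteps, progress, rng=np.random):
    """progress in [0, 1]: fraction of training completed."""
    # Widen the usable timestep range linearly with training progress.
    t_max = max(1, int(num_timesteps * min(1.0, 0.1 + 0.9 * progress)))
    # Mild preference for larger (noisier) timesteps within the current range.
    weights = np.linspace(1.0, 2.0, t_max)
    probs = weights / weights.sum()
    return rng.choice(t_max, size=batch_size, p=probs)

# --- Hypothetical momentum decay with learning-rate compensation (MDLRC-like sketch) ---
# Assumption: decay the momentum coefficient beta1 over training and rescale the
# learning rate so that lr / (1 - beta1), a proxy for the effective step size,
# stays constant relative to the initial setting.
def momentum_and_lr(step, total_steps, base_lr=1e-4, beta1_start=0.9, beta1_end=0.5):
    progress = min(1.0, step / total_steps)
    beta1 = beta1_start + (beta1_end - beta1_start) * progress
    lr = base_lr * (1.0 - beta1) / (1.0 - beta1_start)
    return beta1, lr

In this reading, lowering beta1 reduces how much past gradients accumulate, so the learning rate is raised proportionally to keep the effective update magnitude steady; the specific linear schedules are placeholders.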
Primary Area: generative models
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4962