Keywords: Diffusion Models, Model Acceleration, Adaptive Momentum
TL;DR: We propose FeMo, a momentum-based acceleration framework for diffusion models that predicts future features from historical steps, and Adapted-FeMo, which achieves up to 7.1× speedup without sacrificing generation quality.
Abstract: Diffusion models have demonstrated outstanding generative capabilities in image and video synthesis. However, their heavy computational burden, caused in particular by the sequential denoising process and large model sizes, makes it difficult for them to meet the demands of real-time applications. In this paper, motivated by the continuity of diffusion models in the feature space, we introduce FeMo, which employs a momentum mechanism to stabilize the dynamics of diffusion models across timesteps, allowing us to accurately predict the features at future timesteps from historical information. We further propose Adapted-FeMo, which adaptively searches for the optimal coefficient for each generated sample. Extensive experiments demonstrate its effectiveness, e.g., a 4.99$\times$ acceleration on FLUX with a 0.86% improvement in image reward. While maintaining generation quality, Adapted-FeMo achieves a maximum speedup of 7.10$\times$ on DiT and 6.24$\times$ on FLUX. Our code is available in the supplementary material and will be released on GitHub.
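As a rough illustration of the idea in the abstract (predicting features at a future timestep from historical ones via momentum), the following minimal sketch extrapolates a feature vector using an exponential moving average of past step-to-step differences. It is not the authors' implementation: the class name `MomentumFeaturePredictor`, the coefficient `beta`, and the specific update rule are assumptions made for illustration only, and the per-sample coefficient search of Adapted-FeMo is not modeled.

```python
from typing import List, Optional, Sequence


class MomentumFeaturePredictor:
    """Hypothetical sketch: extrapolate features with an EMA of past differences.

    `beta` is an assumed momentum coefficient, not a value from the paper.
    """

    def __init__(self, beta: float = 0.5) -> None:
        if not 0.0 <= beta < 1.0:
            raise ValueError("beta must be in [0, 1)")
        self.beta = beta
        self.last: Optional[List[float]] = None      # most recent computed feature
        self.velocity: Optional[List[float]] = None  # smoothed per-step change

    def update(self, feature: Sequence[float]) -> None:
        """Record a feature that was actually computed at the current timestep."""
        feature = list(feature)
        if self.last is not None:
            delta = [f - l for f, l in zip(feature, self.last)]
            if self.velocity is None:
                # First observed difference initializes the momentum term.
                self.velocity = delta
            else:
                # Exponential moving average of step-to-step differences.
                self.velocity = [
                    self.beta * v + (1.0 - self.beta) * d
                    for v, d in zip(self.velocity, delta)
                ]
        self.last = feature

    def predict(self, steps: int = 1) -> List[float]:
        """Estimate the feature `steps` timesteps ahead without a forward pass."""
        if self.last is None:
            raise RuntimeError("call update() at least once before predict()")
        if self.velocity is None:
            # Only one observation so far: fall back to reusing it unchanged.
            return list(self.last)
        return [l + steps * v for l, v in zip(self.last, self.velocity)]


predictor = MomentumFeaturePredictor(beta=0.5)
for computed in ([0.0, 0.0], [1.0, 2.0], [2.0, 4.0]):
    predictor.update(computed)
next_feature = predictor.predict(steps=1)
```

In an actual sampler, a prediction of this kind would stand in for the expensive network evaluation on skipped timesteps, with full evaluations interleaved to refresh the history; how often to refresh and how to choose the coefficient are exactly the design questions the paper addresses.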
Supplementary Material: zip
Primary Area: generative models
Submission Number: 13027