Unified Continuous Generative Models for Denoising-based Diffusion

ICLR 2026 Conference Submission 24943 Authors

20 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: generative modeling, denoising diffusion, consistency model, image generation
Abstract: Recent advances in continuous generative models, encompassing multi-step processes such as diffusion and flow matching (typically requiring $8$-$1000$ steps) and few-step methods such as consistency models (typically $1$-$8$ steps), have yielded impressive generative performance. However, existing work often treats these approaches as distinct paradigms, leading to disparate training and sampling methodologies. We propose a unified framework for the training, sampling, and analysis of diffusion, flow matching, and consistency models. Within this framework, we derive a surrogate unified objective that, for the first time, theoretically shows that the few-step objective can be viewed as the multi-step objective plus a regularization term. Building on this framework, we introduce the **U**nified **C**ontinuous **G**enerative **M**odels **T**rainer and **S**ampler (**UCGM**), which enables efficient and stable training of both multi-step and few-step models. Empirically, our framework achieves state-of-the-art results. On ImageNet $256\times256$ with a $675\text{M}$ diffusion transformer, UCGM-T trains a multi-step model achieving $1.30$ FID in $20$ steps, and a few-step model achieving $1.42$ FID in only $2$ steps. Moreover, applying UCGM-S to REPA-E improves its FID from $1.26$ (at $250$ steps) to $1.06$ in only $40$ steps, without additional cost.
Supplementary Material: zip
Primary Area: generative models
Submission Number: 24943