Towards a mathematical theory for consistency training in diffusion models
Abstract: Consistency models, which were proposed to mitigate the high computational overhead during the
sampling phase of diffusion models, facilitate single-step sampling while attaining state-of-the-art empirical
performance. When integrated into the training phase, consistency models seek to learn a sequence of consistency functions that map any point at any time step of the diffusion process back to its starting point. Despite the empirical success, a comprehensive theoretical understanding of consistency
training remains elusive. This paper takes a first step towards establishing theoretical underpinnings for
consistency models. We demonstrate that, to generate samples within $\varepsilon$ proximity of the target distribution (measured in some Wasserstein metric), it suffices for the number of steps in consistency learning to exceed the order of $d^{5/2}/\varepsilon$, where $d$ is the data dimension. Our theory offers rigorous insights into
the validity and efficacy of consistency models, illuminating their utility in downstream inference tasks.
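In symbols, and under one reading of the abstract (the consistency functions $f_t$, the forward process $(X_t)_{1 \le t \le T}$, the data law $p_{\mathrm{data}}$, and the metric $W_1$ are illustrative notation rather than the paper's exact statement; the abstract only specifies "some Wasserstein metric"), the two claims above amount to

$$
f_t(X_t) = X_0 \quad \text{for all } t \in \{1, \dots, T\},
\qquad\text{and}\qquad
T \,\gtrsim\, \frac{d^{5/2}}{\varepsilon}
\;\Longrightarrow\;
W_1\bigl(\mathrm{law}(f_T(X_T)),\, p_{\mathrm{data}}\bigr) \le \varepsilon,
$$

so that a single evaluation of $f_T$ at a draw $X_T$ from the terminal distribution of the forward process yields a sample within $\varepsilon$ of the data distribution.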
Submission Number: 553