Abstract: Consistency models imitate the multi-step sampling of score-based diffusion in a single forward pass of a neural network.
They can be learned in two ways: consistency distillation and consistency training. The former relies on the true velocity field of the corresponding differential equation, approximated by a pre-trained neural network.
In contrast, the latter uses a single-sample Monte Carlo estimate of this velocity field.
The resulting estimation error induces a discrepancy between consistency distillation and consistency training that, as we show, persists even in the continuous-time limit.
To alleviate this issue, we propose a novel flow that transports noisy data towards the corresponding outputs of a consistency model.
We prove that this flow reduces the previously identified discrepancy and the noise-data transport cost.
Consequently, our method not only accelerates consistency training convergence but also enhances its overall performance. The code is available at https://github.com/thibautissenhuth/consistency_GC.
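To make the two training targets concrete, here is a minimal, self-contained sketch (not the authors' code): it assumes a linear interpolation path x_t = (1 - t) x0 + t x1 and uses a placeholder function g_theta as a stand-in for the consistency model; both are illustrative assumptions. The snippet contrasts the single-sample velocity estimate used in consistency training with a generator-augmented coupling in which the noisy point is paired with the model's own output.

```python
# Illustrative sketch only; g_theta and the linear path are assumptions,
# not the paper's implementation.
import numpy as np

rng = np.random.default_rng(0)
dim = 2

def g_theta(x_t, t):
    # Placeholder one-step generator (consistency model); a real model
    # would be a trained neural network mapping (x_t, t) to a sample.
    return (1.0 - t) * x_t

x1 = rng.normal(size=dim)        # data sample
x0 = rng.normal(size=dim)        # noise sample
t = rng.uniform()
x_t = (1.0 - t) * x0 + t * x1    # point on the linear interpolation path

# Consistency training estimates the velocity along the path from the single
# pair (x0, x1): v_hat = x1 - x0, a one-sample Monte Carlo estimate of the
# true conditional velocity E[x1 - x0 | x_t] that distillation approximates
# with a pre-trained network.
v_hat = x1 - x0

# Generator-augmented coupling (illustrative): the data endpoint is replaced
# by the consistency model's own output for the noisy point, so the flow
# transports the noisy sample towards an output the model itself produces.
x1_gen = g_theta(x_t, t)
x_t_gen = (1.0 - t) * x0 + t * x1_gen
v_hat_gen = x1_gen - x0

print("single-sample velocity estimate:", v_hat)
print("generator-augmented velocity estimate:", v_hat_gen)
```

Using the model's own outputs as the endpoints of the flow is the mechanism the abstract credits with reducing both the identified discrepancy and the noise-data transport cost.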
Lay Summary: Most image generation models work by gradually turning random noise into a clear image, a process that can be slow and resource-intensive. To speed this up, methods like Consistency Models (CMs) have been developed. These neural network-based models can generate images in just one step instead of many. There are two ways to train CMs: (i) by imitating an already trained diffusion model, or (ii) by training from scratch without using such a pre-trained model. The second method is attractive because it doesn’t require an existing model.
The question we address is whether these two training methods are equivalent, and we provide a negative answer. Indeed, when a CM is trained from scratch, we prove mathematically that an extra error term affects the model, making it differ from the distillation approach. To alleviate the effect of this term, we introduce a simple solution called Generator-Augmented Flows. This method feeds the model's own predictions back into its training process.
As a result, Generator-Augmented Flows help the model learn faster while generating better images. These findings highlight the importance of designing training procedures that reduce randomness when training CMs.
Link To Code: https://github.com/thibautissenhuth/consistency_GC
Primary Area: Deep Learning->Generative Models and Autoencoders
Keywords: Consistency Models
Submission Number: 9841