Abstract: Generative models based on dynamical equations -- such as diffusion models and probability flows -- offer exceptional sample quality but typically require computationally expensive numerical integration during inference. Recent advances in consistency models have enabled one-step or few-step generation with efficiency comparable to generative adversarial networks; despite their practical success, however, consistency models still lack a unified theoretical framework. Here, we introduce Flow Map Matching (FMM), a principled approach that learns the two-time flow map of an underlying dynamical equation rather than its instantaneous velocity field, thereby providing this missing mathematical foundation. FMM extends the consistency modeling paradigm by allowing the number of inference steps to be tuned on the fly, trading off computational cost against sample quality. Leveraging stochastic interpolants, we propose training objectives both for distillation from a pre-trained velocity field and for direct flow map learning over an interpolant or a forward diffusion process. Theoretically, we show that FMM unifies a range of existing consistency-based approaches, including standard consistency models, consistency trajectory models, and progressive distillation. Experiments on CIFAR-10 and ImageNet 32$\times$32 show that our framework achieves sample quality comparable to that of flow matching models while reducing generation time by a factor of 10--20.
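As a rough illustration of the distillation idea the abstract describes -- learning a two-time flow map from a pre-trained velocity field -- here is a minimal PyTorch sketch. It is not the paper's implementation: the names `FlowMap`, `distillation_loss`, and `sample` are hypothetical, the teacher is assumed to expose a `velocity(x, t)` callable, and the `(t - s)` prefactor is just one convenient way to enforce the identity boundary condition $X_{s,s}(x) = x$.

```python
import torch
import torch.nn as nn


class FlowMap(nn.Module):
    """Two-time flow map X_{s,t}(x), parameterized so that X_{s,s}(x) = x exactly."""

    def __init__(self, dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 2, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x, s, t):
        # The (t - s) prefactor makes the map the identity at t = s.
        h = torch.cat([x, s, t], dim=-1)
        return x + (t - s) * self.net(h)


def distillation_loss(flow_map, velocity, x, s, t):
    """Match d/dt X_{s,t}(x) to the teacher velocity v(X_{s,t}(x), t)."""

    def map_of_t(t_):
        return flow_map(x, s, t_)

    # Forward-mode derivative of the map with respect to the output time t.
    x_t, dxdt = torch.autograd.functional.jvp(
        map_of_t, (t,), (torch.ones_like(t),), create_graph=True
    )
    return ((dxdt - velocity(x_t, t)) ** 2).mean()


@torch.no_grad()
def sample(flow_map, x0, times):
    """Few-step generation: compose the learned map over a chosen time grid."""
    x = x0
    for s_val, t_val in zip(times[:-1], times[1:]):
        s = torch.full((x.shape[0], 1), s_val)
        t = torch.full((x.shape[0], 1), t_val)
        x = flow_map(x, s, t)
    return x
```

In this sketch, a training step would draw `x` from the interpolant (or forward diffusion) at time `s` together with random pairs `s < t` and minimize `distillation_loss`; at inference, `sample(flow_map, x0, [0.0, 1.0])` gives one-step generation, while a finer time grid trades extra compute for quality, matching the tunable step count claimed in the abstract.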
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission:
- Clarification of theoretical background, contributions, and correctness
- Additional experiments
Assigned Action Editor: ~Eduard_Gorbunov1
Submission Number: 3112