Keywords: Flow Matching, Stochastic Interpolants, Variance Reduction, Efficient Sampling
Abstract: We revisit flow matching from a variance-centric perspective.
Although conditional flow matching (CFM) is theoretically elegant, its reliance on single-sample conditional velocities as regression targets introduces high variance into the training objective, which can destabilize optimization and slow convergence.
We demonstrate that this variance gives rise to two distinct regimes: a high-variance regime that hinders training, and a low-variance regime in which conditional and true velocities nearly coincide, enabling analytical sampling shortcuts.
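The single-sample conditional velocity target that CFM regresses onto can be illustrated with a minimal sketch using the common linear interpolant. Everything below (function names, the Gaussian toy data) is illustrative and not taken from the paper; it only shows why the target has high variance: many different $(x_0, x_1)$ pairings pass through similar points $x_t$, each with a different conditional velocity $x_1 - x_0$.

```python
import numpy as np

def cfm_pair(x0, x1, t):
    """Linear interpolant x_t = (1 - t) x0 + t x1 with its
    single-sample conditional velocity target v = x1 - x0."""
    t = t.reshape(-1, 1)
    xt = (1.0 - t) * x0 + t * x1
    v_cond = x1 - x0  # varies across pairings -> high-variance regression target
    return xt, v_cond

rng = np.random.default_rng(0)
x0 = rng.standard_normal((4096, 2))         # source (noise) samples
x1 = rng.standard_normal((4096, 2)) + 3.0   # toy "data" samples (shifted Gaussian)
t = np.full(4096, 0.5)

xt, v = cfm_pair(x0, x1, t)
print(v.var(axis=0))  # per-dimension variance of the conditional velocity target
```

For independent unit-variance source and data samples, the conditional target $v = x_1 - x_0$ has variance 2 per dimension regardless of how concentrated the true (marginal) velocity field is, which is the gap that variance-reduced objectives aim to close.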
Motivated by these insights, we introduce the \textbf{Stable Velocity} framework to improve both the training and sampling processes of flow matching.
For training, we propose \textit{Stable Velocity Matching (StableVM)}, a variance-reduced objective that preserves CFM's global optima while significantly improving stability and convergence in the high-variance regime.
For sampling, we introduce \textit{Stable Velocity Sampling (StableVS)}, a ``free lunch'' acceleration method that exploits the low-variance regime to generate faster without any fine-tuning.
Experiments on SiT-XL trained on ImageNet, as well as on several large pretrained models (SD3, SD3.5, Flux, and Wan2.2), show consistent improvements in training convergence and more than $2\times$ faster sampling while maintaining high fidelity.
Primary Area: generative models
Submission Number: 3342