Direction-Magnitude Decoupling for Fast Video Generation with Flow Matching Models

18 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Efficient Video Synthesis
Abstract: Flow matching models for video generation achieve impressive quality but suffer from high computational overhead due to iterative denoising. The full model, however, is not needed at every denoising step: some steps can be served by lightweight alternatives. Yet directly substituting cached outputs or lightweight models can deviate from the original denoising trajectory, degrading quality. Through empirical analysis, we find that lightweight models robustly capture the magnitude of the original model's output, while cached outputs provide reliable directional guidance. Building on this insight, we propose the Direction-Magnitude Decoupling (DMD) method, which adaptively substitutes a direction-calibrated lightweight model for the original model, accelerating inference while correcting deviations in the denoising trajectory. DMD further reduces inference cost by reusing magnitude information under classifier-free guidance (CFG). As a result, DMD offers a more reliable and lightweight solution for accelerated denoising. Experiments show that DMD outperforms existing acceleration methods, delivering significant speedups (e.g., up to 2.95× on Wan2.1) while maintaining visual fidelity.
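The core recombination described in the abstract — taking the magnitude from a lightweight model's output and the direction from a cached output of the original model — can be sketched as follows. This is a minimal illustration under assumed semantics; the function name, the exact calibration rule, and the adaptive step-selection logic are hypothetical and not specified by the abstract.

```python
import numpy as np

def direction_magnitude_combine(v_light, v_cached, eps=1e-8):
    """Hypothetical sketch: keep the norm of the lightweight model's
    prediction (robust magnitude) but point it along the cached
    prediction from the original model (reliable direction)."""
    # Unit direction from the cached output of the original model.
    direction = v_cached / (np.linalg.norm(v_cached) + eps)
    # Scalar magnitude from the lightweight model's output.
    magnitude = np.linalg.norm(v_light)
    return magnitude * direction

# Toy example: the result has the lightweight model's norm (5.0)
# but points along the cached direction (the x-axis).
v_light = np.array([0.0, 3.0, 4.0])    # norm = 5
v_cached = np.array([1.0, 0.0, 0.0])   # direction along x
v = direction_magnitude_combine(v_light, v_cached)
```

In a full pipeline, such a combined prediction would stand in for the original model's output on steps where the expensive model is skipped; the abstract's claim is that this decoupling keeps the denoising trajectory close to the original one.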
Primary Area: generative models
Submission Number: 10266