FlowCast: Trajectory Forecasting for Scalable Zero-Cost Speculative Flow Matching

ICLR 2026 Conference Submission13489 Authors

Published: 26 Jan 2026, Last Modified: 26 Jan 2026 · ICLR 2026 · CC BY 4.0
Keywords: Flow Matching, Speculative Decoding, Inference Acceleration, Training-Free, Generative Models, Zero-Cost Drafts, Parallel Verification, Adaptive Sampling
TL;DR: FlowCast accelerates Flow Matching inference via training-free speculative generation, using constant-velocity forecasting to skip redundant steps, achieving >2.5× speedup without quality loss.
Abstract: Flow Matching (FM) has recently emerged as a powerful approach for high-quality visual generation. However, its prohibitively slow inference, caused by a large number of denoising steps, limits its use in real-time or interactive applications. Existing acceleration methods, such as distillation, truncation, or consistency training, either degrade quality, incur costly retraining, or fail to generalize. We propose FlowCast, a training-free speculative generation framework that accelerates inference by exploiting the fact that FM models are trained toward constant-velocity trajectories. FlowCast speculates future velocities by extrapolating the current velocity at no additional cost, and accepts a forecast when its mean-squared error falls below a threshold. This constant-velocity forecasting allows redundant steps in stable regions to be aggressively skipped while retaining precision in complex ones. FlowCast is a plug-and-play framework that integrates seamlessly with any FM model and requires no auxiliary networks. We also present a theoretical analysis that bounds the worst-case deviation between speculative and full FM trajectories. Empirical evaluations demonstrate that FlowCast achieves $>2.5\times$ speedup on image generation, video generation, and editing tasks, outperforming existing baselines with no quality loss compared to standard full-step generation.
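The draft-then-verify loop described in the abstract can be sketched in a few lines. The following is a minimal NumPy illustration, not the authors' implementation: `flowcast_euler`, its `draft_len` parameter, and the accept/reject logic are assumptions about how constant-velocity forecasting with an MSE acceptance threshold might look on top of a plain Euler solver for the FM ODE dx/dt = v(x, t).

```python
import numpy as np

def flowcast_euler(velocity_fn, x0, num_steps=50, draft_len=4, mse_threshold=1e-3):
    """Hypothetical sketch of speculative Euler sampling for flow matching.

    Draft: extrapolate `draft_len` steps at the current velocity (zero cost,
    since FM models are trained toward straight, constant-velocity paths).
    Verify: one model call at the drafted endpoint; accept the jump if the
    velocity change stays under an MSE threshold, otherwise fall back to a
    single full Euler step.
    """
    dt = 1.0 / num_steps
    x, step = np.asarray(x0, dtype=float), 0
    v = velocity_fn(x, 0.0)
    calls = 1  # count model evaluations to measure the speedup
    while step < num_steps:
        k = min(draft_len, num_steps - step)
        x_draft = x + k * dt * v                       # zero-cost constant-velocity forecast
        v_draft = velocity_fn(x_draft, (step + k) * dt)
        calls += 1
        if np.mean((v_draft - v) ** 2) <= mse_threshold:
            x, v, step = x_draft, v_draft, step + k    # accept: k steps for one model call
        else:
            x = x + dt * v                             # reject: one careful full Euler step
            step += 1
            if step < num_steps:
                v = velocity_fn(x, step * dt)
                calls += 1
    return x, calls
```

On a perfectly straight trajectory (constant velocity field) every draft is accepted, so the number of model calls drops by roughly `draft_len`; on a strongly curved field every draft is rejected and the loop degrades gracefully to standard full-step Euler integration.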
Primary Area: generative models
Submission Number: 13489