Keywords: text-to-speech, flow matching, meanflow, efficiency, speed-quality tradeoff
TL;DR: We propose SplitMeanFlow, a framework that accelerates speech synthesis by learning average velocity fields, enabling one-step generation without sacrificing quality.
Abstract: Flow Matching has achieved strong performance in generative modeling, yet it is hampered by the high computational cost of iterative sampling. Recent approaches such as MeanFlow address this bottleneck by learning average velocity fields instead of instantaneous velocities. However, we demonstrate that MeanFlow's differential formulation is a special case of a more fundamental principle. In this work, we revisit the first principles of average velocity fields and derive a key algebraic identity: Interval Splitting Consistency. Building on this identity, we propose SplitMeanFlow, a novel framework that directly enforces this algebraic consistency as a core learning objective. Theoretically, we show that SplitMeanFlow recovers MeanFlow's differential identity in the limit, thereby establishing a more general and robust basis for learning average velocity fields.
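As a sketch of the identity (the notation here is assumed for illustration, not quoted from the paper): if $u(z_t, r, t) = \frac{1}{t-r}\int_r^t v(z_\tau, \tau)\,d\tau$ denotes the average velocity over $[r, t]$, then additivity of the integral at any intermediate time $s \in (r, t)$ gives

$$(t-r)\,u(z_t, r, t) = (s-r)\,u(z_s, r, s) + (t-s)\,u(z_t, s, t),$$

an algebraic relation between the model's own outputs that involves no derivatives, which is why no JVP computation is required.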
Practically, SplitMeanFlow simplifies training by eliminating the need for Jacobian-vector product (JVP) computations and enables one-step synthesis. Extensive experiments on large-scale speech synthesis tasks confirm its effectiveness: SplitMeanFlow achieves a 10$\times$ speedup and a 20$\times$ reduction in computational cost while preserving speech quality, delivering substantial efficiency gains without compromising generative performance.
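Below is a minimal PyTorch sketch of a splitting-consistency training objective under the assumptions above (linear interpolation path $z_t = (1-t)x + t\,\epsilon$, noise at $t=1$). The network interface `u_theta(z, r, t)` and the function name are hypothetical, and in practice a boundary term anchoring zero-length intervals to the instantaneous velocity would also be needed; this shows only the splitting-consistency term.

```python
import torch

def splitmeanflow_loss(u_theta, x, eps):
    """Sketch: enforce (t-r)*u(z_t,r,t) = (s-r)*u(z_s,r,s) + (t-s)*u(z_t,s,t)."""
    b = x.shape[0]
    # Sample three times per example and sort so that r < s < t in [0, 1].
    times = torch.rand(b, 3, device=x.device).sort(dim=1).values
    # Reshape for broadcasting against x (works for any trailing shape).
    r, s, t = (times[:, i].view(b, *([1] * (x.dim() - 1))) for i in range(3))
    # Linear interpolation path: data at t=0, noise at t=1 (assumed convention).
    z_t = (1 - t) * x + t * eps
    with torch.no_grad():
        # Right sub-interval [s, t]; stepping back along it gives z_s.
        u_right = u_theta(z_t, s, t)
        z_s = z_t - (t - s) * u_right
        # Left sub-interval [r, s].
        u_left = u_theta(z_s, r, s)
        # Length-weighted combination of the two halves is the target
        # (in practice one might clamp t - r away from zero).
        target = ((s - r) * u_left + (t - s) * u_right) / (t - r)
    # The full-interval prediction must match the composed, stop-gradient target.
    pred = u_theta(z_t, r, t)
    return ((pred - target) ** 2).mean()
```

Under this convention, one-step synthesis corresponds to stepping across the full interval: `x_hat = eps - u_theta(eps, r=0, t=1)`.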
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 13368