Shortcut Diffusion Training with Cumulative Consistency Loss: An Optimal Control View

Published: 26 Jan 2026, Last Modified: 26 Feb 2026 · ICLR 2026 Poster · CC BY 4.0
Keywords: One-Step Diffusion, Optimal Control, Shortcut Diffusion Models
Abstract: Although iterative denoising (i.e., diffusion/flow) methods offer strong generative performance, they suffer from low generation efficiency, requiring hundreds of network forward passes to simulate a single sample. Mitigating this requires taking larger step sizes during simulation, thereby allowing one- or few-step generation. The recently proposed shortcut model learns larger step sizes by enforcing alignment between its step direction and the path defined by a base many-step flow-matching model through a self-consistency loss. However, its generation quality is significantly lower than that of the base model. In this paper, we formulate few-step generation as a controlled base generative process and show that the self-consistency loss can be understood through the lens of optimal control. This perspective naturally motivates its generalization to the proposed cumulative self-consistency loss, which cumulatively penalizes misalignment along the entire trajectory. This encourages larger step sizes that not only align with the base model at the current time step but also support alignment in subsequent steps, facilitating high-quality generation. Furthermore, we draw a connection between our approach and reinforcement learning, potentially opening the door to a new set of approaches for few-step generation. Experiments show that we significantly improve one- and few-step generation quality under the same training budget. Implementation is available at: [https://github.com/paribeshregmi/Shortcut-CSL](https://github.com/paribeshregmi/Shortcut-CSL)
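To make the abstract's two losses concrete, here is a minimal, self-contained sketch of the shortcut self-consistency residual and its cumulative variant. All names and signatures here (`toy_velocity`, `self_consistency_residual`, `cumulative_consistency_loss`) are illustrative assumptions, not the paper's implementation: a fixed analytic field stands in for the learned shortcut network, and the residual follows the shortcut-model idea that one step of size 2d should match the average of two chained steps of size d.

```python
import numpy as np

def toy_velocity(x, t, d):
    # Hypothetical stand-in for a learned shortcut network s_theta(x, t, d):
    # a fixed analytic field so the sketch runs end to end.
    return -x * (1.0 + 0.1 * d) + t

def self_consistency_residual(x, t, d):
    # Shortcut self-consistency: one step of size 2d should match the
    # average direction of two chained steps of size d.
    v1 = toy_velocity(x, t, d)
    x_mid = x + d * v1
    v2 = toy_velocity(x_mid, t + d, d)
    target = 0.5 * (v1 + v2)
    pred = toy_velocity(x, t, 2 * d)
    return float(np.mean((pred - target) ** 2))

def cumulative_consistency_loss(x0, d=0.125, n_steps=8):
    # Cumulative variant (per the abstract): accumulate the misalignment
    # penalty at every point along the simulated trajectory, rather than
    # penalizing only a single sampled time step.
    x = np.asarray(x0, dtype=float)
    t, total = 0.0, 0.0
    for _ in range(n_steps):
        total += self_consistency_residual(x, t, d)
        x = x + d * toy_velocity(x, t, d)
        t += d
    return total
```

The per-step residual alone corresponds to the original shortcut loss at one time step; summing it over the rolled-out trajectory is what makes the penalty "cumulative," so early steps are encouraged to land in states where later steps can also stay consistent.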
Primary Area: generative models
Submission Number: 19781