Keywords: Diffusion Models, Generative Models, Training-Free Acceleration
TL;DR: Accelerating Denoising Generative Models is as Easy as Predicting the Second-Order Difference
Abstract: High-fidelity diffusion and flow models remain latency-bound at inference, motivating acceleration that leaves pretrained weights untouched. We ask: what is the $\textit{minimal yet principled}$ way to accelerate sampling? Under a simple, mild budget on denoiser calls, when uniform reduction targets more than $2\times$ speedup, each three-step window contains at most one fresh denoiser call, creating a structural scarcity of real signals.
From this constraint, we isolate the $\textit{observed}$ information at step $t$—the fresh output $\psi_t$ and its backward difference $\Delta \psi_{t}^{(1)} = \psi_t - \psi_{t+1}$—and show it induces a uniquely minimal, affine-exact second-order predictor $\hat\psi_{t-1} = 2\psi_t - \psi_{t+1}$.
We prove that, under this scarcity, the two-point second-order rule is the information-consistent optimum: it is the best linear unbiased estimator (BLUE) among two-point linear estimators.
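A minimal sketch of this two-point predictor (the function name and NumPy representation are illustrative; the abstract states only the formula):

```python
import numpy as np

def predict_next(psi_t: np.ndarray, psi_t_plus_1: np.ndarray) -> np.ndarray:
    """Two-point predictor: hat{psi}_{t-1} = 2 * psi_t - psi_{t+1}.

    Equivalently psi_t + (psi_t - psi_{t+1}), i.e., the fresh output plus
    its backward difference, so it is exact whenever psi is affine in t.
    """
    return 2.0 * psi_t - psi_t_plus_1

# Affine trajectories are reproduced exactly: for psi(t) = a + b * t,
# predict_next(psi(1), psi(2)) equals psi(0).
a, b = np.array([0.5]), np.array([-2.0])
psi = lambda t: a + b * t
assert np.allclose(predict_next(psi(1), psi(2)), psi(0))
```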
Naively chaining this predictor across consecutive steps destabilizes sampling by compounding approximation errors.
We resolve this by $\textit{reusing the observed tuple}$ in an interleaved zig–zag schedule that prevents back-to-back extrapolations and controls variance.
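A schematic of one such interleaving, assuming a strict fresh/extrapolated alternation after a two-step warm-up and placeholder `denoiser` and `step` callables (the exact ZEUS schedule and update rule are given in the paper):

```python
def zigzag_sample(denoiser, step, x, timesteps):
    """Interleaved schedule: no two extrapolations occur back to back.

    Assumptions for this sketch: `denoiser(x, t)` returns the model
    output psi_t, `step(x, psi, t)` performs one solver update, and
    fresh calls strictly alternate with extrapolated steps.
    """
    hist = []                  # two most recent outputs (psi_{t+1}, psi_t)
    extrapolated_last = True   # force fresh calls until a tuple is observed
    for t in timesteps:
        if extrapolated_last or len(hist) < 2:
            psi = denoiser(x, t)             # fresh call: a real signal
            extrapolated_last = False
        else:
            psi = 2.0 * hist[-1] - hist[-2]  # reuse the observed tuple
            extrapolated_last = True         # next step must be fresh
        hist = (hist + [psi])[-2:]           # keep only the latest pair
        x = step(x, psi, t)
    return x
```

Forcing a fresh call after every extrapolated step keeps a predicted output from feeding more than one later extrapolation, which is the variance-control property the schedule targets.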
The resulting method, $\textbf{ZEUS}$, is a zero-overhead, backbone- and parameterization-agnostic plug-in requiring no retraining, no feature caches, and no architectural changes.
Across image and video generation, ZEUS consistently moves the speed–fidelity Pareto frontier outward relative to recent state-of-the-art methods, delivering up to $3.2\times$ end-to-end speedup while improving perceptual similarity.
Primary Area: generative models
Submission Number: 19957