S4S: Solving for a Fast Diffusion Model Solver

Published: 01 May 2025, Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
TL;DR: We learn the coefficients and time steps of diffusion model solvers to minimize global approximation error.
Abstract: Diffusion models (DMs) create samples from a data distribution by starting from random noise and iteratively solving a reverse-time ordinary differential equation (ODE). Because each step in the iterative solution requires an expensive neural function evaluation (NFE), there has been significant interest in approximately solving these diffusion ODEs with only a few NFEs without modifying the underlying model. However, in the few-NFE regime, we observe that tracking the true ODE evolution is fundamentally impossible using traditional ODE solvers. In this work, we propose a new method that learns a good solver for the DM, which we call **S**olving **for** the **S**olver (**S4S**). S4S directly optimizes a solver to obtain good generation quality by learning to match the output of a strong teacher solver. We evaluate S4S on six different pre-trained DMs, including pixel-space and latent-space DMs for both conditional and unconditional sampling. In all settings, S4S uniformly improves the sample quality relative to traditional ODE solvers. Moreover, our method is lightweight, data-free, and can be applied in a black-box fashion on top of any discretization schedule or architecture to improve performance. Building on top of this, we also propose **S4S-Alt**, which optimizes both the solver and the discretization schedule. By exploiting the full design space of DM solvers, with 5 NFEs, we achieve an FID of 3.73 on CIFAR10 and 13.26 on MS-COCO, representing a $1.5\times$ improvement over previous training-free ODE methods.
Lay Summary: Diffusion models can generate realistic images by starting with pure noise and gradually turning it into a meaningful image. This process is typically slow: it takes many steps, and each step requires running a large model. People have been trying to speed this up by using fewer steps, but that usually hurts the quality of the final image. In this work, we introduce a new method called **S**olving **for** the **S**olver (S4S). Instead of relying on traditional ways to speed things up, S4S *learns* how to make the process more efficient by mimicking what a slower, high-quality method does. We tested S4S on six different kinds of image generators, and in every case, it improved the quality of the generated images without needing extra data. We also introduce S4S-Alt, which interleaves S4S with another routine for learning a better discretization of time. By exploiting the full design space of DM solvers, with just 5 steps, we achieve an FID of 3.73 on CIFAR10 and 13.26 on MS-COCO, representing a $1.5\times$ improvement over previous training-free ODE methods.
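The core idea of learning solver coefficients to match a teacher can be illustrated on a toy problem. The sketch below is not the paper's actual method or training setup; it is a minimal, hypothetical analogue in which a 5-step Euler-style solver for a simple nonlinear ODE (standing in for the diffusion probability-flow ODE) gets per-step coefficients tuned by finite-difference gradient descent to match a fine-grained teacher solve. All function names and hyperparameters here are illustrative assumptions.

```python
import numpy as np

def f(x):
    # Toy drift standing in for the expensive diffusion-model network call (one NFE).
    return -x ** 3

def student(x0, step_sizes, coeffs):
    # Few-step solver with learnable per-step coefficients: x <- x + c_k * h_k * f(x).
    x = x0
    for h, c in zip(step_sizes, coeffs):
        x = x + c * h * f(x)
    return x

def teacher(x0, t_total=1.0, n=2000):
    # Fine-grained Euler solve playing the role of the "strong teacher" solver.
    x, h = x0, t_total / n
    for _ in range(n):
        x = x + h * f(x)
    return x

rng = np.random.default_rng(0)
x0s = rng.uniform(-1.5, 1.5, size=64)          # stand-in for initial noise samples
targets = np.array([teacher(x) for x in x0s])  # data-free: targets come from the teacher

K = 5                                          # 5 "NFEs"
hs = np.full(K, 1.0 / K)                       # fixed uniform discretization schedule
coeffs = np.ones(K)                            # baseline coefficients = plain Euler

def loss(c):
    # Global approximation error of the few-step solver against the teacher.
    outs = np.array([student(x, hs, c) for x in x0s])
    return np.mean((outs - targets) ** 2)

baseline = loss(coeffs)
eps, lr = 1e-4, 0.1
for _ in range(200):
    # Finite-difference gradient descent on the K solver coefficients.
    g = np.zeros(K)
    for i in range(K):
        d = np.zeros(K)
        d[i] = eps
        g[i] = (loss(coeffs + d) - loss(coeffs - d)) / (2 * eps)
    coeffs -= lr * g

assert loss(coeffs) < baseline  # learned coefficients beat plain few-step Euler
```

The same optimization could, in principle, also be run over the step sizes `hs`, which mirrors the S4S-Alt idea of alternating between learning the solver and the discretization schedule.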
Primary Area: Deep Learning->Generative Models and Autoencoders
Keywords: Diffusion models, efficient sampling, ODE solvers
Submission Number: 13027