Keywords: Diffusion models, Conditional sampling, Reward alignment, Flow matching, Few-step generation, Model distillation, SDE, Flow Map
TL;DR: We turn conditional diffusion sampling into a deterministic conditional ODE with matching marginals and distill it into a stochastic few-step sampler for fast reward-aligned generation.
Abstract: Reward alignment for flow and diffusion models, via test-time steering or fine-tuning, often relies on sampling from conditional distributions $p_{\tau\mid t}(\,\cdot \mid x_t)$ induced by stochastic dynamics. In practice, this creates a computational bottleneck: it requires expensive SDE simulation. Meanwhile, recent few-step accelerations for generative flows largely target deterministic dynamics and therefore do not directly address stochastic conditional sampling. We introduce stochastic few-step models, a framework for fast sampling from SDE-defined conditional distributions that maps the conditional SDE to a deterministic ODE with matching marginals. Building on this formulation, we show that the resulting conditional ODEs can be effectively distilled into a single few-step model, enabling efficient conditional rollouts. Experiments on Gaussian-mixture and MNIST steering show that the resulting sampler provides accurate conditional samples that improve reward steering, outperforming standard denoiser heuristics.
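The marginal-matching step in the abstract can be illustrated in one dimension: for an SDE $dx = f(x,t)\,dt + g(t)\,dW$, the probability-flow-style ODE $\dot x = f(x,t) - \tfrac{1}{2} g^2(t)\,\nabla_x \log p_t(x)$ has the same time-marginals. A minimal numerical sketch of this fact (the Ornstein-Uhlenbeck process, Gaussian initial law, Euler discretization, and sample sizes below are illustrative assumptions, not the paper's actual setup):

```python
import numpy as np

# Toy check: the OU SDE  dx = -x dt + sqrt(2) dW  and the marginal-matching ODE
#   dx/dt = -x - score(x, t)        (drift f - (1/2) g^2 * score, f = -x, g^2 = 2)
# should yield the same marginal distribution at every time t.
rng = np.random.default_rng(0)
n, x0, s0, T, steps = 50_000, 2.0, 0.2, 1.0, 1_000
dt = T / steps

def mean_var(t):
    # Closed-form OU marginal when x_0 ~ N(x0, s0^2):
    #   x_t ~ N(x0 e^{-t},  s0^2 e^{-2t} + 1 - e^{-2t})
    return x0 * np.exp(-t), s0**2 * np.exp(-2 * t) + 1.0 - np.exp(-2 * t)

x_sde = x0 + s0 * rng.standard_normal(n)  # both samplers share the initial law
x_ode = x0 + s0 * rng.standard_normal(n)

for k in range(steps):
    m, v = mean_var(k * dt)
    score = -(x_ode - m) / v                    # exact Gaussian score of p_t
    # Euler-Maruyama for the SDE, plain Euler for the deterministic ODE
    x_sde += -x_sde * dt + np.sqrt(2 * dt) * rng.standard_normal(n)
    x_ode += (-x_ode - score) * dt

m_T, v_T = mean_var(T)
print(f"target  mean={m_T:.3f}  var={v_T:.3f}")
print(f"SDE     mean={x_sde.mean():.3f}  var={x_sde.var():.3f}")
print(f"ODE     mean={x_ode.mean():.3f}  var={x_ode.var():.3f}")
```

Each ODE particle moves deterministically given its start point, yet the population marginals track the SDE's at every time; it is this deterministic transport map that a few-step model can then be distilled from.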
Submission Number: 143