A unified perspective on fine-tuning and sampling with diffusion and flow models

ICLR 2026 Conference Submission 21192 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: diffusion models, flow matching, sampling, reward fine-tuning, neural sampler, stochastic optimal control, thermodynamics, Jarzynski identity, Crooks fluctuation theorem
TL;DR: We unify SOC- and thermodynamics-based approaches to fine-tuning and sampling with diffusion and flow models
Abstract: We study the problem of training diffusion and flow generative models to sample from target distributions defined by an exponential tilting of the base density. This task subsumes both sampling from unnormalized densities and reward fine-tuning of a pre-trained model, and can be approached from a stochastic optimal control (SOC) perspective and from a thermodynamics perspective. The SOC formulation has been tackled using adjoint-based methods (Adjoint Matching and Adjoint Sampling) and score-matching methods, while the thermodynamics formulation has given rise to algorithms such as CMCD and NETS. Our contributions include bounding the lean adjoint ODE underlying Adjoint Matching and Adjoint Sampling, deriving bias–variance decompositions that allow a principled comparison between adjoint-based and score-matching methods, adapting thermodynamic formulations to the exponential tilting setting, and presenting text-to-image fine-tuning experiments.
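For concreteness, the exponential tilting referenced in the abstract typically denotes a target of the form below; this is a minimal sketch in conventional notation (the reward r and temperature λ are standard symbols in this literature, not taken verbatim from the submission):

    p_target(x) ∝ p_base(x) · exp(r(x) / λ)

Reward fine-tuning corresponds to taking r to be a (learned) reward evaluated on samples; sampling from an unnormalized density e^{−U(x)}/Z is recovered (with λ = 1) by choosing r(x) = −U(x) − log p_base(x), so that the tilt replaces the base density with the desired target.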
Primary Area: generative models
Submission Number: 21192