Inference-Time Diffusion Model Alignment via Random Ordinary Equations

05 Sept 2025 (modified: 11 Feb 2026). Submitted to ICLR 2026. License: CC BY 4.0
Keywords: Diffusion Models, Inference-Time Scaling, Diffusion Model Alignment, Random Ordinary Equations
TL;DR: We scale low-dimensional actions with random ordinary equations to achieve inference-time diffusion model alignment.
Abstract: Aligning diffusion models (DMs) with human preferences is a challenging yet practical task. Recent efforts focus on training-free methods, but these usually adopt high-dimensional action spaces or require differentiable rewards. To address these issues, we propose a novel inference-time alignment framework based on random ordinary differential equation sampling. Specifically, we first formulate DM alignment as a max-encountered-reward optimal control problem. Then, by fixing the process noise and optimizing the perturbation strength, we obtain a 1-D action space, which integrates naturally with Monte Carlo tree search. We can thus perform trajectory search to derive the optimal control in a gradient-free manner, thereby supporting non-differentiable rewards. We also provide theoretical guarantees and empirical evidence to support our method. Experiments show that our method maintains sufficient sample diversity and successfully aligns pre-trained DMs with reward functions defined on clean image domains. Our method outperforms traditional inference-step scaling, achieving higher best rewards. Meanwhile, it is significantly more parameter-efficient than existing approaches that adopt high-dimensional action spaces. Our approach can be integrated in a plug-and-play manner into any DM that uses multi-step inference.
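As a rough illustration of the idea described in the abstract (fixed process noise, a scalar perturbation strength per step, and a max-encountered-reward objective searched without gradients), the following is a minimal toy sketch. It is not the authors' algorithm: the scalar state, the linear drift, the reward, the discrete action set, and all function names are assumptions made for illustration, and an exhaustive enumeration of the action tree stands in for Monte Carlo tree search.

```python
import itertools
import random


def rollout(x0, noises, strengths, drift=0.5):
    """Toy sampler: the noise sequence is fixed, and each step only chooses
    a scalar strength that scales the corresponding noise term."""
    x = x0
    states = []
    for eps, a in zip(noises, strengths):
        x = x - drift * x + a * eps  # contraction toward 0 plus scaled fixed noise
        states.append(x)
    return states


def reward(x, target=1.0):
    """Hypothetical black-box reward; it is only evaluated, never differentiated."""
    return -abs(x - target)


def search(x0, noises, actions):
    """Enumerate every per-step strength plan and score each plan by the
    maximum reward encountered along its trajectory."""
    best_score, best_plan = float("-inf"), None
    for plan in itertools.product(actions, repeat=len(noises)):
        score = max(reward(x) for x in rollout(x0, noises, plan))
        if score > best_score:
            best_score, best_plan = score, plan
    return best_score, best_plan


rng = random.Random(0)
noises = [rng.gauss(0.0, 1.0) for _ in range(4)]  # process noise, sampled once and frozen
actions = (0.0, 0.5, 1.0)                         # assumed discrete 1-D action set

best_score, best_plan = search(2.0, noises, actions)
baseline = max(reward(x) for x in rollout(2.0, noises, (1.0,) * len(noises)))
```

Because the unperturbed plan (strength 1.0 at every step) is itself one of the enumerated plans, `best_score` can never be lower than `baseline`. Enumeration grows exponentially with the number of steps, which is the cost a tree search with selective expansion is meant to avoid.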
Supplementary Material: zip
Primary Area: generative models
Submission Number: 2419