Stochastic Control for Fine-tuning Diffusion Models: Optimality, Regularity, and Convergence

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
Abstract: Diffusion models have emerged as powerful tools for generative modeling, demonstrating exceptional capability in capturing target data distributions from large datasets. However, fine-tuning these massive models for specific downstream tasks, constraints, and human preferences remains a critical challenge. While recent advances have leveraged reinforcement learning algorithms to tackle this problem, much of the progress has been empirical, with limited theoretical understanding. To bridge this gap, we propose a stochastic control framework for fine-tuning diffusion models. Building on denoising diffusion probabilistic models as the pre-trained reference dynamics, our approach integrates linear dynamics control with Kullback–Leibler regularization. We establish the well-posedness and regularity of the stochastic control problem and develop a policy iteration algorithm (PI-FT) for its numerical solution. We show that PI-FT achieves global convergence at a linear rate. Unlike existing work that assumes such regularity holds throughout training, we prove that the control and value sequences generated by the algorithm preserve the desired regularity. Finally, we extend our framework to parametric settings for efficient implementation and demonstrate the practical effectiveness of the proposed PI-FT algorithm through numerical experiments.
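For orientation, a schematic form of a KL-regularized control objective of this type can be written as follows; the notation (reward $r$, control $u$, regularization weight $\lambda$, reference drift $b$) is illustrative and not taken from the paper:

$$ \max_{u}\ \mathbb{E}\big[r(X_T^u)\big] - \lambda\,\mathrm{KL}\big(\mathbb{P}^{u}\,\big\|\,\mathbb{P}^{\mathrm{pre}}\big), \qquad dX_t^u = \big(b(X_t^u, t) + u(X_t^u, t)\big)\,dt + \sigma_t\,dW_t, $$

where $\mathbb{P}^{\mathrm{pre}}$ is the path measure of the pre-trained (reference) dynamics and $\mathbb{P}^{u}$ that of the controlled dynamics. The KL term penalizes deviation from the pre-trained model, so the fine-tuned dynamics trade reward against staying close to the reference.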
Lay Summary: Modern AI systems can now generate realistic images, videos, and sounds using a family of models called diffusion models. These models are trained on huge datasets and can create impressive results. However, adapting them to perform well on specific tasks, follow human preferences, or respect certain constraints is still very difficult. Currently, many researchers use trial-and-error approaches or complex reinforcement learning techniques to fine-tune these models, but we still lack a clear theoretical understanding of how and why these methods work. In this paper, we introduce a new mathematical framework that treats fine-tuning as a stochastic control problem: we view the diffusion model as a system that can be guided toward better performance using carefully designed adjustments. We develop an algorithm with proven guarantees that it finds the best way to adjust the model, and it does so efficiently. Our approach not only sheds light on how fine-tuning works but also provides practical tools to make diffusion models more flexible and useful in real-world applications.
Primary Area: Deep Learning->Theory
Keywords: Diffusion models, fine-tuning, stochastic control, reinforcement learning, policy iteration, optimality, regularity, convergence
Submission Number: 2266