TL;DR: We present UniDB, a unified diffusion bridge framework using stochastic optimal control, significantly improving detail preservation and image quality in generative tasks with minimal code modifications.
Abstract: Recent advances in diffusion bridge models leverage Doob’s $h$-transform to establish fixed endpoints between distributions, demonstrating promising results in image translation and restoration tasks. However, these approaches frequently produce blurred or excessively smoothed image details and lack a comprehensive theoretical foundation to explain these shortcomings. To address these limitations, we propose UniDB, a unified framework for diffusion bridges based on Stochastic Optimal Control (SOC). UniDB formulates the problem through an SOC-based optimization and derives a closed-form solution for the optimal controller, thereby unifying and generalizing existing diffusion bridge models. We demonstrate that existing diffusion bridges employing Doob’s $h$-transform constitute a special case of our framework, emerging when the terminal penalty coefficient in the SOC cost function tends to infinity. By incorporating a tunable terminal penalty coefficient, UniDB achieves an optimal balance between control costs and terminal penalties, substantially improving detail preservation and output quality. Notably, UniDB seamlessly integrates with existing diffusion bridge models, requiring only minimal code modifications. Extensive experiments across diverse image restoration tasks validate the superiority and adaptability of the proposed framework. Our code is available at https://github.com/UniDB-SOC/UniDB/.
Lay Summary: Diffusion bridges are advanced models used for tasks like image restoration and generation, leveraging diffusion processes to transition between two arbitrary distributions. A key technique in this area is Doob's $h$-transform, which modifies the diffusion process to ensure it reaches a specific endpoint. Although effective, models built on Doob's $h$-transform frequently produce blurred or excessively smoothed image details, and a comprehensive theoretical explanation of these shortcomings has been lacking.
UniDB, our proposed framework, addresses this issue by formulating diffusion bridges with Stochastic Optimal Control (SOC) and deriving the closed-form solution for this SOC problem. Unlike Doob's $h$-transform, which focuses solely on endpoint accuracy, UniDB balances control costs and terminal penalties through a tunable parameter. This balance allows UniDB to produce images with better detail preservation and perceptual quality, avoiding the blurring artifacts often seen with Doob's $h$-transform.
UniDB's strength lies in its flexibility and adaptability. It integrates with existing diffusion bridge models and improves their performance with only minimal code modifications, making it a practical way to enhance diffusion-based image restoration and generation.
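To make the role of the terminal penalty concrete, here is a minimal sketch on a toy problem that is not taken from the paper: a scalar process $dx_t = u_t\,dt + \sigma\,dw_t$ steered toward a target $y$ under the assumed cost $\mathbb{E}\big[\int_0^T \tfrac{1}{2}u_t^2\,dt + \tfrac{\gamma}{2}(x_T - y)^2\big]$. For this linear-quadratic case the optimal control is $u_t = -\frac{\gamma}{1+\gamma(T-t)}(x_t - y)$, which tends to the Brownian bridge drift $-(x_t - y)/(T-t)$ given by Doob's $h$-transform as $\gamma \to \infty$. The symbols $\gamma$, $\sigma$, $y$ and all function names below are illustrative assumptions; UniDB's actual derivation and code may differ.

```python
# Toy illustration only: scalar linear-quadratic control with a terminal penalty.
# All names here are hypothetical and are not part of the UniDB codebase.

def control_gain(t, T, gamma):
    """Feedback gain of the optimal control u = -gain * (x - y) for finite gamma."""
    return gamma / (1.0 + gamma * (T - t))


def doob_gain(t, T):
    """Gain of the Brownian bridge drift, i.e. the gamma -> infinity limit."""
    return 1.0 / (T - t)


def terminal_mean(x0, y, T, gamma, steps=10_000):
    """Euler-integrate the mean of the controlled process up to time T.

    The noise term has zero mean, so the mean m(t) obeys
    dm/dt = -control_gain(t) * (m - y).
    """
    dt = T / steps
    m = x0
    for k in range(steps):
        t = k * dt
        m += -control_gain(t, T, gamma) * (m - y) * dt
    return m


def terminal_mean_closed_form(x0, y, T, gamma):
    """Closed-form terminal mean: y + (x0 - y) / (1 + gamma * T)."""
    return y + (x0 - y) / (1.0 + gamma * T)


if __name__ == "__main__":
    for gamma in (1.0, 10.0, 1000.0):
        print(gamma, terminal_mean(0.0, 1.0, 1.0, gamma))
```

In this toy setting a finite $\gamma$ leaves a residual gap $(x_0 - y)/(1+\gamma T)$ between the mean endpoint and the target in exchange for a smaller control effort, while a very large $\gamma$ forces the endpoint onto the target, as the hard constraint of Doob's $h$-transform does.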
Link To Code: https://github.com/UniDB-SOC/UniDB/
Primary Area: Deep Learning->Generative Models and Autoencoders
Keywords: Diffusion bridge, Doob's h-transform, Stochastic optimal control
Submission Number: 2519