Keywords: Diffusion Samplers, Diffusion Bridges, Bayesian Learning, Diffusion Models
TL;DR: We show that the theoretical rationale for the LV loss is questionable in diffusion bridges and instead advocate the reverse Kullback-Leibler divergence with the log derivative trick. Additionally, we introduce exploration techniques to alleviate mode collapse.
Abstract: Diffusion bridges are a promising class of deep-learning methods for sampling from unnormalized distributions. Recent works show that the Log Variance (LV) loss consistently outperforms the reverse Kullback-Leibler (rKL) loss when the reparametrization trick is used to compute rKL gradients. While the LV loss is theoretically justified for diffusion samplers with non-learnable forward processes, where it yields gradients identical to those of the rKL loss combined with the log derivative trick, this equivalence does not hold for diffusion bridges.
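For orientation, a schematic recap of the two losses may help (our notation, not necessarily the paper's): write $\mathbb{P}^{q_\theta}$ for the path measure of the learnable sampler, $\mathbb{P}^{p}$ for that of the reference process, and $\mathbb{P}^{w}$ for an arbitrary reference measure.

```latex
% Schematic definitions (notation assumed, not taken from the paper).
% R_theta is the log Radon-Nikodym derivative between the path measure
% of the learnable sampler and that of the reference process.
\[
  R_\theta(X) = \log \frac{d\mathbb{P}^{q_\theta}}{d\mathbb{P}^{p}}(X),
  \qquad
  \mathcal{L}_{\mathrm{rKL}}(\theta)
    = \mathbb{E}_{X \sim \mathbb{P}^{q_\theta}}\big[R_\theta(X)\big],
  \qquad
  \mathcal{L}_{\mathrm{LV}}(\theta)
    = \mathrm{Var}_{X \sim \mathbb{P}^{w}}\big[R_\theta(X)\big].
\]
```

When the forward process is fixed and one chooses $w = q_\theta$ with gradients detached through the sampling distribution, $\nabla_\theta \mathcal{L}_{\mathrm{LV}}$ reduces to the log-derivative-trick estimator of $\nabla_\theta \mathcal{L}_{\mathrm{rKL}}$; with a learnable forward process, as in diffusion bridges, this identity no longer holds.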
We point out that the LV loss does not unconditionally satisfy the data processing inequality, casting doubt on its suitability for diffusion bridges. To avoid this problem, we employ the rKL loss with the log derivative trick and show that it consistently outperforms the LV loss. Furthermore, we introduce two techniques for controlling the exploration-exploitation trade-off in diffusion samplers: one based on variational annealing and the other on off-policy exploration. We validate their effectiveness on highly multimodal benchmark tasks.
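As a concrete illustration of the advocated estimator, here is a minimal PyTorch-style sketch (our own hypothetical code, not the paper's implementation; the function name, `log_q`/`log_p` arguments, and the batch-mean baseline are assumptions):

```python
import torch

def rkl_log_derivative_loss(log_q: torch.Tensor, log_p: torch.Tensor) -> torch.Tensor:
    """Surrogate loss whose gradient is the log-derivative-trick
    (score-function) estimator of grad_theta KL(q_theta || p).

    log_q: log-density of the samples under the learnable sampler
           q_theta; differentiable w.r.t. theta. Samples are assumed to
           be drawn WITHOUT reparametrized gradients flowing through them.
    log_p: unnormalized target log-density at the same samples; treated
           as constant w.r.t. theta.
    """
    # Per-sample weight log q_theta(x) - log p(x), detached so that the
    # only gradient path is through the score term log_q below.
    advantage = (log_q - log_p).detach()
    # Batch-mean baseline for variance reduction; any theta-independent
    # baseline keeps the estimator unbiased since E_q[grad log q] = 0.
    advantage = advantage - advantage.mean()
    # Gradient of this surrogate: E_q[(log q/p - b) * grad_theta log q].
    return (advantage * log_q).mean()
```

In a diffusion-bridge setting, `log_q` and `log_p` would be the accumulated path log-densities of sampled trajectories under the learnable and reference processes, respectively; the same surrogate then applies per batch of trajectories.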
Submission Number: 62