Diffusion Tree Sampling: Scalable inference‑time alignment of diffusion models

Published: 10 Jun 2025, Last Modified: 13 Jul 2025, ICML 2025 Poster, License: CC BY 4.0
Keywords: Diffusion models, inference-time alignment, Monte Carlo Tree Search, test-time sampling and search, generative models
TL;DR: We build a Monte Carlo tree over the diffusion denoising process that enables scalable, compute-efficient inference-time alignment of pretrained diffusion models to new reward functions
Abstract: Adapting a pretrained diffusion model to new objectives at inference time remains an open problem in generative modeling. Existing steering methods suffer from inaccurate value estimation, especially at high noise levels, which biases guidance. Moreover, information from past runs is not reused to improve sample quality, leading to inefficient use of compute. Inspired by the success of Monte Carlo Tree Search, we address these limitations by casting inference-time alignment as a search problem that reuses past computations. We introduce a tree-based approach that _samples_ from the reward-aligned target density by propagating terminal rewards back through the diffusion chain and iteratively refining value estimates with each additional generation. Our proposed method, Diffusion Tree Sampling (DTS), produces asymptotically exact samples from the target distribution in the limit of infinite rollouts, and its greedy variant Diffusion Tree Search (DTS*) performs a robust search for high-reward samples. On MNIST and CIFAR-10 class-conditional generation, DTS matches the FID of the best-performing baseline with up to $5\times$ less compute. In text-to-image generation and language completion tasks, DTS* effectively searches for high-reward samples that match best-of-N with $2\times$ less compute. By reusing information from previous generations, we obtain an _anytime algorithm_ that turns additional compute budget into steadily better samples, providing a scalable approach for inference-time alignment of diffusion models.
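To make the abstract's mechanism concrete, below is a minimal, self-contained sketch of the tree-sampling idea: build a tree over the denoising chain, back terminal rewards up the tree, and refine soft value estimates with every rollout. This is an illustrative toy, not the authors' implementation; `denoise_step`, `reward`, `BRANCH`, and `LAMBDA` are all hypothetical stand-ins, and the one-dimensional state is chosen only to keep the example runnable.

```python
import math
import random

# Toy sketch of the tree-sampling idea described in the abstract (not the
# paper's code). Assumed pieces (hypothetical): denoise_step(x, t) samples one
# ancestral step x_{t-1} ~ p(x_{t-1} | x_t) from a pretrained diffusion model,
# reward(x0) scores a clean sample, and LAMBDA tempers the reward-aligned
# target density proportional to p(x0) * exp(reward(x0) / LAMBDA).

T = 5          # number of denoising steps (toy)
BRANCH = 3     # children expanded per tree node
LAMBDA = 1.0   # reward temperature

def denoise_step(x, t):
    # Stand-in for one reverse-diffusion sampling step.
    return x + random.gauss(0.0, 1.0 / (t + 1))

def reward(x0):
    # Stand-in terminal reward; the method supports arbitrary reward functions.
    return -abs(x0 - 1.0)

def logaddexp(a, b):
    # Numerically stable log(exp(a) + exp(b)); handles the -inf initial value.
    if a == -math.inf:
        return b
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

class Node:
    def __init__(self, x, t):
        self.x, self.t = x, t
        self.children = []
        self.logsum = -math.inf  # running log-sum-exp of backed-up rewards
        self.visits = 0

    def value(self):
        # Soft value estimate: log-mean-exp of rewards observed below this node.
        return self.logsum - math.log(self.visits)

def select_child(children):
    # Sampling variant (DTS-like): pick a child with probability proportional
    # to exp(value); a greedy variant (DTS*-like) would take the argmax.
    logits = [c.value() for c in children]
    m = max(logits)
    weights = [math.exp(l - m) for l in logits]
    return random.choices(children, weights=weights)[0]

def rollout(root):
    # One root-to-leaf pass: expand or select down the tree, then back up the
    # terminal reward so value estimates improve with each added generation.
    node, path = root, [root]
    while node.t > 0:
        if len(node.children) < BRANCH:
            child = Node(denoise_step(node.x, node.t), node.t - 1)
            node.children.append(child)
            node = child
        else:
            node = select_child(node.children)
        path.append(node)
    r = reward(node.x) / LAMBDA
    for n in path:
        n.logsum = logaddexp(n.logsum, r)
        n.visits += 1
    return node.x

root = Node(x=0.0, t=T)
samples = [rollout(root) for _ in range(200)]  # anytime: more rollouts, better samples
print(f"mean reward: {sum(reward(s) for s in samples) / len(samples):.3f}")
```

The log-sum-exp backup is one plausible way to realize the "soft" value estimates the abstract alludes to; because the tree and its statistics persist across rollouts, each call to `rollout` reuses all prior computation, which is what makes the procedure anytime.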
Submission Number: 49