UniGuide: Learning Guidance Policies for Multi-Objective Diffusion Sampling

Published: 02 Mar 2026, Last Modified: 02 Mar 2026 · ReALM-GEN 2026 - ICLR 2026 Workshop · CC BY 4.0
Keywords: Diffusion Models, Classifier-Free Guidance, Multi-Objective Reinforcement Learning
Abstract: Guidance is the primary mechanism for steering conditional diffusion models, yet modern samplers rely on coupled, step-dependent design choices that obscure trade-offs between alignment, realism, and diversity. We introduce UniGuide, a unified framework that formulates guided diffusion sampling as a sequential decision-making problem, in which guidance actions are selected along the denoising trajectory based on the evolving model state. UniGuide employs a flexible parameterization that assigns distinct guidance scales to the denoiser and the noise-prediction term. To navigate multi-objective trade-offs, we learn a low-dimensional, preference-conditioned policy that maps intermediate model states to per-step guidance parameters. This enables continuous traversal of a Pareto frontier at inference time via a preference vector, eliminating the need for retuning or manual schedule design. Experiments on conditional image generation demonstrate effective Pareto-optimal guidance with pretrained diffusion backbones.
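The sampling scheme sketched in the abstract can be illustrated with a toy example. Note this is a hypothetical sketch, not the authors' implementation: `toy_denoiser`, `guidance_policy`, and all weights below are stand-ins, assuming a CFG-style combination in which the policy maps (timestep, preference vector, a state feature) to two per-step scales, one for the unconditional denoiser term and one for the noise-prediction difference term.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_denoiser(x, t, cond):
    """Stand-in for a pretrained diffusion model's noise prediction.
    cond=1.0 mimics the conditional branch, cond=0.0 the unconditional one."""
    return 0.5 * x + 0.1 * cond - 0.01 * t

def guidance_policy(t, pref, state_norm):
    """Hypothetical low-dimensional policy: maps the current timestep, a
    2-D preference vector, and a scalar state feature to per-step guidance
    scales (s_denoise, s_eps). Weights here are arbitrary illustrations."""
    feats = np.array([t / 50.0, pref[0], pref[1], state_norm])
    W = np.array([[0.5, 2.0, -1.0, 0.1],
                  [0.3, -0.5, 1.5, 0.05]])
    return 1.0 + np.tanh(W @ feats)  # keep scales in a benign (0, 2) range

def sample(pref, steps=50, dim=8):
    """Euler-style denoising loop with policy-chosen per-step guidance."""
    x = rng.standard_normal(dim)
    for t in range(steps, 0, -1):
        eps_c = toy_denoiser(x, t, cond=1.0)   # conditional prediction
        eps_u = toy_denoiser(x, t, cond=0.0)   # unconditional prediction
        s_den, s_eps = guidance_policy(t, pref, np.linalg.norm(x) / np.sqrt(dim))
        # Distinct scales for the denoiser term and the noise-prediction term,
        # instead of a single fixed classifier-free-guidance weight.
        eps = s_den * eps_u + s_eps * (eps_c - eps_u)
        x = x - (1.0 / steps) * eps
    return x

x_align = sample(pref=np.array([0.9, 0.1]))   # preference favoring alignment
x_divers = sample(pref=np.array([0.1, 0.9]))  # preference favoring diversity
```

Sweeping the preference vector at inference time would, under this framing, trace out different points on the alignment/diversity trade-off without retraining the backbone or hand-designing a guidance schedule.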
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 99