UniGuide: Learning Guidance Policies for Multi-Objective Diffusion Sampling

Published: 02 Mar 2026, Last Modified: 02 Mar 2026 · ReALM-GEN 2026 - ICLR 2026 Workshop · CC BY 4.0
Keywords: Diffusion Models, Classifier-Free Guidance, Multi-Objective Reinforcement Learning
Abstract: Guidance is the primary mechanism for steering conditional diffusion models, yet modern samplers rely on coupled, step-dependent design choices that obscure trade-offs between alignment, realism, and diversity. We introduce UniGuide, a unified framework that formulates guided diffusion sampling as a sequential decision-making problem, in which guidance actions are selected along the denoising trajectory based on the evolving model state. UniGuide employs a flexible parameterization that assigns distinct guidance scales to the denoiser and the noise-prediction term. To navigate multi-objective trade-offs, we learn a low-dimensional, preference-conditioned policy that maps intermediate model states to per-step guidance parameters. This enables continuous traversal of a Pareto frontier at inference time via a preference vector, eliminating the need for retuning or manual schedule design. Experiments on conditional image generation demonstrate effective Pareto-optimal guidance with pretrained diffusion backbones.
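The sampling scheme sketched in the abstract can be illustrated with a toy example. Note this is a hypothetical sketch, not the authors' implementation: `toy_denoiser`, `guidance_policy`, and all weights below are stand-ins, assuming a CFG-style combination in which the policy maps (timestep, preference vector, a state feature) to two per-step scales, one for the unconditional denoiser term and one for the noise-prediction difference term.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_denoiser(x, t, cond):
    """Stand-in for a pretrained diffusion model's noise prediction.
    cond=1.0 mimics the conditional branch, cond=0.0 the unconditional one."""
    return 0.5 * x + 0.1 * cond - 0.01 * t

def guidance_policy(t, pref, state_norm):
    """Hypothetical low-dimensional policy: maps the current timestep, a
    2-D preference vector, and a scalar state feature to per-step guidance
    scales (s_denoise, s_eps). Weights here are arbitrary illustrations."""
    feats = np.array([t / 50.0, pref[0], pref[1], state_norm])
    W = np.array([[0.5, 2.0, -1.0, 0.1],
                  [0.3, -0.5, 1.5, 0.05]])
    return 1.0 + np.tanh(W @ feats)  # keep scales in a benign (0, 2) range

def sample(pref, steps=50, dim=8):
    """Euler-style denoising loop with policy-chosen per-step guidance."""
    x = rng.standard_normal(dim)
    for t in range(steps, 0, -1):
        eps_c = toy_denoiser(x, t, cond=1.0)   # conditional prediction
        eps_u = toy_denoiser(x, t, cond=0.0)   # unconditional prediction
        s_den, s_eps = guidance_policy(t, pref, np.linalg.norm(x) / np.sqrt(dim))
        # Distinct scales for the denoiser term and the noise-prediction term,
        # instead of a single fixed classifier-free-guidance weight.
        eps = s_den * eps_u + s_eps * (eps_c - eps_u)
        x = x - (1.0 / steps) * eps
    return x

x_align = sample(pref=np.array([0.9, 0.1]))   # preference favoring alignment
x_divers = sample(pref=np.array([0.1, 0.9]))  # preference favoring diversity
```

Sweeping the preference vector at inference time would, under this framing, trace out different points on the alignment/diversity trade-off without retraining the backbone or hand-designing a guidance schedule.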
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 99