Keywords: Diffusion models, Text-to-image generation, Test-time alignment, Preference alignment
Abstract: Pre-trained diffusion models demonstrate remarkable performance in text-to-image generation, and current research efforts are directed toward aligning them with human preferences across diverse application scenarios. Existing approaches often rely on costly pipelines that require collecting preference data, training reward models, and fine-tuning. A promising alternative is test-time alignment, which steers diffusion models during sampling without retraining. However, current test-time alignment methods typically depend on explicit reward models to provide a guidance signal for modifying the sampling path; this requires decoding noisy intermediate images and estimating their rewards, which adds computational overhead and can limit flexibility across diverse scenarios. We propose Contrastive Gradient Guidance (CGG), a conceptually straightforward and practical framework for test-time alignment that avoids explicit reward models by design. CGG derives its guidance signal from the contrastive difference between two diffusion models, parameterized as the gradient of the log-likelihood ratio between the favored and unfavored distributions. This signal steers a pre-trained diffusion model along its sampling path while implicitly aligning generation with human preferences. Experiments demonstrate that CGG consistently improves preference alignment in text-to-image generation and flexibly adapts to safety-critical and multi-preference scenarios. Moreover, CGG can be combined with prevailing test-time alignment techniques to yield additional gains. These results establish CGG as a principled framework for advancing test-time alignment of diffusion models.
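One plausible reading of the construction described in the abstract, written as a sketch rather than the paper's exact formulation (the symbols $p_w$, $p_l$, $s_\theta$, and $\lambda$ are illustrative, not taken from the paper): if $p_w$ and $p_l$ denote the favored and unfavored distributions, each approximated by a diffusion model's score, the guidance signal at a noisy sample $x_t$ would be the gradient of their log-likelihood ratio,

$g(x_t) \;=\; \nabla_{x_t} \log \frac{p_w(x_t)}{p_l(x_t)} \;=\; \nabla_{x_t} \log p_w(x_t) \;-\; \nabla_{x_t} \log p_l(x_t),$

which could then be added to the pre-trained model's score during sampling, e.g. $s_\theta(x_t) + \lambda\, g(x_t)$ with a guidance scale $\lambda$, so that generation is steered toward the favored distribution without ever evaluating an explicit reward model.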
Primary Area: generative models
Submission Number: 22280