Keywords: Diffusion models, classifier guidance, conditional image generation, entropy regularization, reverse-KL guidance, tilted loss
TL;DR: Regularize, not rescale: entropy + reverse-KL guidance prevent gradient vanishing in classifier-guided diffusion, improving FID without retraining.
Abstract: Classifier-guided diffusion models are a powerful approach to conditional image generation, but they suffer from overconfident predictions during early denoising steps, which causes the guidance gradient to vanish. This paper makes two complementary contributions to classifier-guided diffusion. First, we introduce a differentiable Smooth Expected Calibration Error (Smooth ECE) loss that improves classifier calibration with minimal fine-tuning, yielding an approximately 3% improvement in Fréchet Inception Distance (FID). Second, we propose enhanced sampling guidance methods that operate on off-the-shelf classifiers without retraining: (1) tilted sampling, which leverages batch-level information to control the influence of outliers; (2) adaptive entropy-regularized sampling, which maintains diversity; and (3) a novel divergence-regularized sampling method, which adds a class-aware, mode-covering correction that strengthens movement toward the target class while maintaining exploration. Theoretical analysis shows that our methods combine enhanced target-direction guidance with controlled diversity exploration, mitigating gradient vanishing. Experimental results on ImageNet demonstrate that our best divergence-guided sampling achieves an FID of 2.13 while maintaining competitive precision and recall. Our methods offer a practical way to improve conditional generation quality without the computational overhead of retraining the classifier or the diffusion model.
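The gradient-vanishing effect mentioned in the abstract can be illustrated at the logit level. The sketch below is not the paper's method: it only uses the standard identity that the gradient of log softmax(z)[y] with respect to the logits z equals one_hot(y) - p, and shows that its norm shrinks as the classifier becomes more confident in the target class. The three-class logits and the scaling factors are hypothetical values chosen for illustration, and the full guidance gradient with respect to the noisy image would additionally pass through the classifier's Jacobian.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def log_prob_grad(logits, target):
    # Gradient of log softmax(logits)[target] w.r.t. the logits:
    # one_hot(target) - p.
    p = softmax(logits)
    return [(1.0 if k == target else 0.0) - p_k for k, p_k in enumerate(p)]

def grad_norm(logits, target):
    # Euclidean norm of the logit-space guidance gradient.
    return math.sqrt(sum(g * g for g in log_prob_grad(logits, target)))

# Hypothetical 3-class logits; larger scales mimic a more confident classifier.
base = [2.0, 0.5, -1.0]
scales = (1.0, 3.0, 10.0)
norms = [grad_norm([s * z for z in base], target=0) for s in scales]

for s, n in zip(scales, norms):
    print(f"scale={s:>4}: ||grad|| = {n:.3e}")
```

As the scale grows, the predicted probability of the target class approaches 1 and the printed norm decays toward zero, which is the regime the proposed regularized guidance terms are meant to address.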
Submission Number: 28