Keywords: diffusion models, guidance mechanisms, semantic fidelity
Abstract: Denoising diffusion models excel at conditional generation but face a trade-off under classifier-free guidance: large guidance scales improve semantic alignment but reduce diversity and cause distortions, especially when the conditions are hierarchically structured. We propose Two-Period Guidance Diffusion (TPGD), a simple strategy that adapts hierarchical guidance over the course of the denoising process. Specifically, TPGD applies coarse guidance in the early steps to establish global structure, then switches to stronger guidance in the later steps to refine details. An analysis under a Gaussian mixture model shows that TPGD aligns better with the target distribution than standard guidance. Experiments on text-to-image benchmarks further show that TPGD consistently improves semantic fidelity while preserving diversity, providing a principled and effective alternative to fixed-scale guidance.
Primary Area: generative models
Submission Number: 24084
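Illustrative sketch (not the authors' implementation): the abstract describes a two-period schedule, with coarse guidance early and stronger guidance later, and the snippet below shows one way such a schedule could be wired into the standard classifier-free guidance update. Everything specific here is an assumption made for illustration: the function names, the single switch point `switch_frac`, the scale values `early_scale` and `late_scale`, and the reading of "coarse guidance" as conditioning on a coarse-level prompt with a smaller scale. The abstract does not specify any of these.

```python
from typing import List, Sequence, Tuple


def cfg_combine(eps_uncond: Sequence[float],
                eps_cond: Sequence[float],
                scale: float) -> List[float]:
    """Standard classifier-free guidance: eps_u + scale * (eps_c - eps_u)."""
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


def two_period_schedule(step: int,
                        num_steps: int,
                        switch_frac: float = 0.5,   # hypothetical switch point
                        early_scale: float = 2.0,   # hypothetical coarse-period scale
                        late_scale: float = 7.5     # hypothetical fine-period scale
                        ) -> Tuple[str, float]:
    """Return (period, scale) for a step index, where 0 is the first (noisiest) step."""
    if not 0 <= step < num_steps:
        raise ValueError("step must satisfy 0 <= step < num_steps")
    switch_step = int(switch_frac * num_steps)
    if step < switch_step:
        return "coarse", early_scale
    return "fine", late_scale


def guided_eps(step: int,
               num_steps: int,
               eps_uncond: Sequence[float],
               eps_coarse: Sequence[float],
               eps_fine: Sequence[float]) -> List[float]:
    """Pick the coarse or fine conditional prediction by period, then apply CFG.

    Assumption: eps_coarse and eps_fine are noise predictions conditioned on a
    coarse-level and a fine-level condition, respectively.
    """
    period, scale = two_period_schedule(step, num_steps)
    eps_cond = eps_coarse if period == "coarse" else eps_fine
    return cfg_combine(eps_uncond, eps_cond, scale)
```

In a sampler loop, `guided_eps` would stand in for the usual fixed-scale guidance call at each denoising step; the noise predictions themselves would come from the diffusion model, which is outside the scope of this sketch.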