Rectifying Diffusion Guidance with Exponential Moving Average

Wooyeol Baek; Seongdo Kim; Jinseong Kim; Haechan Shin; Yiwon Yu; Jongyoo Kim

20 Sept 2025 (modified: 11 Feb 2026). Submitted to ICLR 2026. License: CC BY 4.0

Keywords: diffusion models, flow models, guidance, conditional generation

TL;DR: We propose Rectified EMA Guidance (REG), a training-free method that uses a model's prediction EMA to rectify guidance, preventing oversaturation while improving quality and diversity without auxiliary components.

Abstract: Continuous-time generative models, such as diffusion or flow models, employ Classifier-Free Guidance (CFG) to generate high-quality samples. However, CFG sacrifices sample diversity for sample quality, which leads to oversimplification. Furthermore, at high guidance scales, an excessively accumulated guidance term causes oversaturation and degrades sample quality. Previous attempts to solve this problem often require auxiliary components to construct a "weak" version of the model, which hinders their adoption for publicly released models, where such auxiliary components are not accessible. In this paper, we revisit the property that the Exponential Moving Average (EMA) of a model's predictions during the sampling phase acts as a low-pass filter. By suppressing high frequencies, the EMA of a model's predictions inherently degrades sample quality, allowing it to be leveraged as the "weak" version while still retaining conditional components. To take advantage of this property, we introduce Rectified EMA Guidance (REG), a simple yet effective training-free approach. REG rectifies the guidance term with the difference between the EMAs of the model's conditional and unconditional predictions. This rectified term offsets amplified conditional components, playing a crucial role in preserving sample diversity while improving quality. We validate REG on both class-conditional and text-conditional (industrial) models and demonstrate that it improves sample quality and preserves diversity by preventing oversaturation and oversimplification, even in high-guidance scenarios. To the best of our knowledge, REG is the first training-free guidance method that improves sample quality orthogonally and does not require auxiliary components, a specific model architecture, or a specific modality. Therefore, it is applicable to a wide range of generative models, including large-scale public and industrial models.
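
The ingredients named in the abstract can be illustrated with a minimal sketch. The CFG combination and the EMA update below are the standard textbook forms. The function `rectified_guidance`, its `rect_weight` parameter, and the way the EMA difference is subtracted from the guidance term are assumptions made here for illustration only; the abstract does not state the exact REG update rule, so this should not be read as the authors' formulation. Predictions are plain Python lists of floats to keep the sketch dependency-free.

```python
def cfg(cond, uncond, scale):
    """Standard classifier-free guidance: uncond + scale * (cond - uncond)."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


class PredictionEMA:
    """Running exponential moving average of per-step model predictions.

    A higher `decay` smooths more strongly across sampling steps,
    which is the low-pass behaviour the abstract refers to.
    """

    def __init__(self, decay):
        self.decay = decay
        self.value = None  # set on the first update

    def update(self, pred):
        if self.value is None:
            # Initialise the average with the first prediction.
            self.value = list(pred)
        else:
            self.value = [
                self.decay * m + (1.0 - self.decay) * p
                for m, p in zip(self.value, pred)
            ]
        return self.value


def rectified_guidance(cond, uncond, ema_cond, ema_uncond, scale, rect_weight):
    """Hypothetical rectified form (an assumption, not taken from the paper).

    The guidance term (cond - uncond) is offset by a weighted difference
    of the EMAs of the conditional and unconditional predictions.
    """
    return [
        u + scale * ((c - u) - rect_weight * (ec - eu))
        for c, u, ec, eu in zip(cond, uncond, ema_cond, ema_uncond)
    ]
```

In a sampling loop, one `PredictionEMA` would be kept for the conditional predictions and one for the unconditional predictions, both updated at every step before the guided prediction is formed. With `rect_weight` set to zero, the sketch reduces to plain CFG.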

Supplementary Material: zip

Primary Area: generative models

Submission Number: 24653
