Feature Modulating for Diffusion Models

17 Sept 2025 (modified: 11 Feb 2026) | Submitted to ICLR 2026 | CC BY 4.0
Keywords: Diffusion models, Text to Image, Generative models
TL;DR: We propose Feature Modulating (FM), a training-free approach that improves image quality and text-image alignment in text-to-image diffusion models.
Abstract: We propose Feature Modulating (FM), a training-free approach that improves image quality and text-image alignment in text-to-image diffusion models. Rather than relying on additional input information, our heuristic FM module directly modulates latent features during the denoising steps. The modulation alters the feature distribution, which in turn changes the generated images. To explore the impact of feature modulation, we introduce a channel-wise monotonic modulation function that adjusts feature values using a single parameter, making it easier to obtain high-quality images. The FM module is architecture-agnostic and can be integrated into existing diffusion models. Extensive experiments across multiple benchmarks demonstrate that our feature modulation enhances image quality, semantic fidelity, and realism.
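As an illustration only: the abstract does not give the form of the modulation function, so the sketch below assumes a signed-power map applied to per-channel standardized values, controlled by a single hypothetical parameter `gamma`. The function names, the choice of map, and the standardization step are all assumptions made for this example, not the authors' method. It shows what "channel-wise", "monotonic", and "single parameter" could mean in practice.

```python
import math


def modulate_channel(values, gamma):
    """Apply an assumed monotonic signed-power map to one channel.

    `values` is a flat list of floats for a single channel.
    `gamma` is a hypothetical single modulation parameter (> 0).
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive to keep the map monotonic")
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0.0:
        # Constant channel: there is no spread to reshape.
        return list(values)
    out = []
    for v in values:
        z = (v - mean) / std                        # standardize within the channel
        z_mod = math.copysign(abs(z) ** gamma, z)   # sign-preserving power map
        out.append(mean + std * z_mod)              # map back to the original scale
    return out


def feature_modulate(features, gamma):
    """Modulate each channel independently (features: list of channels)."""
    return [modulate_channel(channel, gamma) for channel in features]


# Example: two channels; gamma = 1 leaves values unchanged up to rounding.
features = [[-2.0, -0.5, 0.0, 1.0, 3.0], [4.0, 4.0, 4.0]]
modulated = feature_modulate(features, gamma=1.5)
```

Under this assumed form, `gamma > 1` stretches values far from the channel mean and compresses values near it, `gamma < 1` does the opposite, and the ordering of values within a channel is preserved in both cases.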
Supplementary Material: zip
Primary Area: generative models
Submission Number: 8897