Noise Consistency Training: A Native Approach for One-step Generator in Learning Additional Controls
Keywords: One-step Controllable Generation
Abstract: The pursuit of efficient and controllable high-quality content generation stands as a pivotal challenge in artificial intelligence-generated content (AIGC).
While one-step generators, refined through diffusion distillation techniques, offer excellent generation quality and computational efficiency, adapting them to new control conditions—such as structural constraints, semantic guidelines, or external inputs—poses a significant challenge.
Conventional approaches often necessitate computationally expensive modifications to the base model and subsequent diffusion distillation.
This paper introduces Noise Consistency Training (NCT), a novel and lightweight approach to directly integrate new control signals into pre-trained one-step generators without requiring access to original training images or retraining the base diffusion model.
NCT introduces an adapter module and employs a noise consistency loss in the noise space of the generator.
This loss aligns the adapted model's generation behavior across noises that depend on the control condition to varying degrees, implicitly guiding it to adhere to the new control.
From a moment-matching perspective, this training objective can be interpreted as aligning the adapted generator with the intractable conditional distribution defined by a discriminative model and the one-step generator.
NCT is modular, data-efficient, and easily deployable, relying only on the pre-trained one-step generator and a control signal model. Extensive experiments demonstrate that NCT achieves state-of-the-art controllable generation in a single forward pass, surpassing existing multi-step and distillation-based methods in both generation quality and computational efficiency.
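To make the high-level description above concrete, the sketch below shows what a noise-consistency-style training step for an adapter on a frozen one-step generator might look like. It is only an illustration under assumptions: the `ControlAdapter` architecture, the two conditioning strengths (`s_weak`, `s_strong`), the MSE consistency term, and the `control_loss` placeholder are hypothetical choices, not the paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ControlAdapter(nn.Module):
    """Hypothetical adapter: maps a base noise z and a control signal c to a shifted noise."""

    def __init__(self, noise_dim: int, cond_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(noise_dim + cond_dim, hidden),
            nn.SiLU(),
            nn.Linear(hidden, noise_dim),
        )

    def forward(self, z: torch.Tensor, c: torch.Tensor, strength: float) -> torch.Tensor:
        # "strength" interpolates between the original noise (0) and a fully
        # condition-dependent noise (1), yielding noises that depend on c to
        # varying degrees.
        return z + strength * self.net(torch.cat([z, c], dim=-1))


def nct_step(generator, adapter, control_loss, z, c, opt,
             s_weak=0.3, s_strong=1.0, lam=1.0):
    """One illustrative training step: a consistency term between weakly and
    strongly conditioned noises, plus a penalty from a control-signal model."""
    opt.zero_grad()
    with torch.no_grad():
        x_ref = generator(adapter(z, c, s_weak))    # reference branch, no gradient
    x_cond = generator(adapter(z, c, s_strong))     # conditioned branch

    loss = F.mse_loss(x_cond, x_ref) + lam * control_loss(x_cond, c)
    loss.backward()
    opt.step()
    return loss.item()


if __name__ == "__main__":
    noise_dim, cond_dim = 64, 16
    # Stand-ins for the frozen one-step generator and the control-signal model.
    generator = nn.Linear(noise_dim, 128)
    for p in generator.parameters():
        p.requires_grad_(False)
    control_loss = lambda x, c: (x.mean(dim=-1) - c.mean(dim=-1)).pow(2).mean()

    adapter = ControlAdapter(noise_dim, cond_dim)
    opt = torch.optim.Adam(adapter.parameters(), lr=1e-4)
    z = torch.randn(8, noise_dim)
    c = torch.randn(8, cond_dim)
    print(nct_step(generator, adapter, control_loss, z, c, opt))
```

In this sketch both the one-step generator and the control-signal model stay frozen and only the adapter receives gradients, consistent with the abstract's claim that NCT needs neither retraining of the base diffusion model nor access to the original training images.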
Primary Area: Deep learning (e.g., architectures, generative models, optimization for deep networks, foundation models, LLMs)
Submission Number: 20461