Rescaling Intermediate Features Makes Trained Consistency Models Perform Better

Published: 19 Mar 2024, Last Modified: 10 May 2024 · Tiny Papers @ ICLR 2024 (Notable) · CC BY 4.0
Keywords: Consistency Models, Diffusion Models, Representation, Image Generation
TL;DR: Rescaling the intermediate features of a trained consistency model during inference can improve generation quality.
Abstract: In the domain of deep generative models, diffusion models are renowned for high-quality image generation but are constrained by intensive computational demands. To mitigate this, consistency models have been proposed as a computationally efficient alternative. Our research reveals that post-training rescaling of internal features can enhance the one-step sample quality of these models without incurring detectable computational overhead. This optimization is evidenced by a clear improvement in Fréchet Inception Distance (FID): for example, with our rescaled consistency distillation (CD) model, the FID on ImageNet drops from 6.2 to 5.2 and on LSUN-Cat from 10.9 to 9.5. Closer inspection of the generated images suggests that this enhancement stems from improved visual detail and clarity.
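The abstract describes rescaling intermediate features of an already-trained model at inference time, without modifying its weights. Below is a minimal PyTorch sketch of that general idea using a forward hook; the toy backbone, the choice of layer to rescale, and the scale factor of 1.1 are all hypothetical placeholders, not the paper's actual architecture or tuned values.

```python
import torch
import torch.nn as nn

# Toy stand-in for a trained consistency-model backbone (illustration only).
class TinyBackbone(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Conv2d(3, 16, 3, padding=1)
        self.mid = nn.Conv2d(16, 16, 3, padding=1)   # "intermediate" block to rescale
        self.decoder = nn.Conv2d(16, 3, 3, padding=1)

    def forward(self, x):
        return self.decoder(self.mid(self.encoder(x)))

def add_rescale_hook(module: nn.Module, scale: float):
    """Register a forward hook that multiplies the module's output by `scale`.

    The hook acts only at inference time and touches no weights, so the
    trained model itself is left unchanged.
    """
    def hook(_mod, _inputs, output):
        return output * scale
    return module.register_forward_hook(hook)

model = TinyBackbone().eval()
handle = add_rescale_hook(model.mid, scale=1.1)  # hypothetical scale factor

with torch.no_grad():
    x_T = torch.randn(1, 3, 32, 32)   # noise input for one-step sampling
    sample = model(x_T)               # intermediate features rescaled on the fly

handle.remove()  # restore the original, unrescaled behavior
```

Because the hook only multiplies an existing activation tensor, the extra cost is a single elementwise scaling per forward pass, consistent with the claim of no detectable computational overhead.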
Submission Number: 111