Keywords: Image generation, GANs, diffusion models, progressive generation
TL;DR: We propose a novel progressive method for image synthesis from semantics to details with a denoising diffusion GAN.
Abstract: Image generation has long been dominated by generative adversarial networks (GANs) due to their superior ability to generate realistic images. Recently, by decomposing the image generation process into a sequence of denoising steps, denoising diffusion probabilistic models (DDPMs) have shown remarkable sample quality and diversity in image generation. However, DDPMs typically face two main challenges that GANs do not: a time-consuming sampling process and a semantically meaningless latent space. Although these two challenges have started to draw attention in recent work on DDPMs, they are usually addressed separately. In this paper, by reinterpreting the sampling process of DDPMs under a special noise scheduler, we propose a novel progressive training pipeline that addresses both challenges simultaneously. Concretely, we decompose the sampling process into two stages: first generating semantics and then progressively refining details. As a result, when the DDPM predicts the real image at each time step, its sampling process can be interpreted as a refinement process rather than a denoising process. Motivated by this new interpretation, we present a training pipeline that progressively shifts attention from semantics to sample quality during training. Extensive results on two benchmarks show that our proposed diffusion model achieves competitive results with as few as two sampling steps on unconditional image generation. Importantly, the latent space of our diffusion model is shown to be semantically meaningful and can be exploited for various downstream tasks (e.g., attribute manipulation).
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Generative models