Towards Enhanced Controllability of Diffusion Models

20 Sept 2023 (modified: 11 Feb 2024) | Submitted to ICLR 2024
Supplementary Material: pdf
Primary Area: generative models
Keywords: Diffusion Models, Generative Models, Controllability, Editability
Abstract: As Diffusion Models have shown remarkable capabilities in generating images, their controllability has received much attention. However, there is still room for improvement in some aspects of controllability, such as feature disentanglement for extended editability and the natural composition of multiple conditions. In this paper, we present three methods that can be used in either training or sampling to enhance the controllability of Diffusion Models. Concretely, we train Diffusion Models conditioned on two latent codes: a spatial content mask and a flattened style embedding. We rely on the inductive bias of the progressive denoising process of Diffusion Models to encode pose/layout information in the spatial content mask and semantic/style information in the style embedding. We also propose two generic sampling techniques for improving controllability. First, we extend Composable Diffusion Models to allow for some dependence between conditional inputs, which improves the quality of generations while also providing control over the amount of guidance from each condition and from their joint distribution. Second, we propose timestep-dependent weight scheduling for the content and style latents to further improve the translations. We observe better controllability compared to existing methods and show that, with our proposed methods, Diffusion Models can be used for effective image manipulation and image translation.
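The two sampling techniques named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the combination rule (an unconditional noise prediction plus weighted content, style, and joint guidance terms), the sigmoid schedule, the timestep convention, and the choice to favor content guidance at noisy steps and style guidance at late steps are all assumptions made for illustration. The names `sigmoid_weight` and `compose_noise` are hypothetical.

```python
import math


def sigmoid_weight(t, num_steps, w_max, increasing, sharpness=10.0):
    """Hypothetical schedule: ramp a guidance weight between 0 and w_max.

    Assumed convention: t = num_steps - 1 is the noisiest step and
    t = 0 is the final, nearly clean step. Requires num_steps > 1.
    """
    x = t / (num_steps - 1)  # normalize the timestep to [0, 1]
    s = 1.0 / (1.0 + math.exp(-sharpness * (x - 0.5)))
    return w_max * (s if increasing else 1.0 - s)


def compose_noise(eps_uncond, eps_content, eps_style, eps_joint,
                  w_content, w_style, w_joint):
    """Combine noise predictions elementwise (inputs are flat lists of floats).

    Assumed rule: start from the unconditional prediction and add one
    weighted guidance direction per condition, plus one for the joint
    (content and style together) prediction.
    """
    return [
        u + w_content * (c - u) + w_style * (s - u) + w_joint * (j - u)
        for u, c, s, j in zip(eps_uncond, eps_content, eps_style, eps_joint)
    ]


# Toy usage with made-up noise predictions for a 3-element "image".
num_steps = 50
eps_u = [0.0, 0.0, 0.0]
eps_c = [1.0, 0.0, 0.0]
eps_s = [0.0, 1.0, 0.0]
eps_j = [1.0, 1.0, 0.0]
for t in (num_steps - 1, 0):
    # Assumption: content guidance dominates early (noisy) steps,
    # style guidance dominates late (clean) steps.
    w_c = sigmoid_weight(t, num_steps, w_max=2.0, increasing=True)
    w_s = sigmoid_weight(t, num_steps, w_max=2.0, increasing=False)
    eps = compose_noise(eps_u, eps_c, eps_s, eps_j, w_c, w_s, w_joint=1.0)
    print(t, [round(v, 3) for v in eps])
```

Setting all three weights to zero recovers the unconditional prediction, and setting `w_joint` to zero treats the two conditions as independent, which is one way to read the separate control over each condition and over their joint distribution.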
Submission Number: 2769