Keywords: Diffusion models, latent variable models, disentanglement
TL;DR: We propose a method for giving the latent variables of diffusion models an interpretable meaning.
Abstract: Latent variable models are useful tools for discovering independent generative factors of data without human supervision. Under the ODE formulation, diffusion models are invertible latent variable models, but unlike models such as VAEs, their latent variables are generally not interpretable: traversing a single element of the latent noise, for example, does not lead to a meaningful variation of the generated content. To address this issue, we propose to divide the latent vector into multiple groups of elements and to design a different noise schedule for each group. This allows each group to control only certain elements of the data, explicitly giving it an interpretable meaning. Applying our method in the frequency domain, the latent variable becomes a hierarchical representation in which individual groups encode the data at different levels of abstraction. We show several applications of such a representation, including disentanglement of semantic attributes and image editing.
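The core mechanism described in the abstract, a partitioned latent with a different noise schedule per group, could look roughly like the minimal sketch below. The cosine-style schedule, the per-group time shifts, and the function names are illustrative assumptions, not the authors' implementation.

```python
import torch

def groupwise_alpha_bar(t: torch.Tensor, shifts: torch.Tensor) -> torch.Tensor:
    """Signal level (alpha_bar) per group at time t in [0, 1].
    Each group's schedule is shifted in time, so different groups are noised/denoised
    at different stages of generation. `shifts` is a hypothetical per-group offset."""
    t_shifted = (t.view(-1, 1) - shifts.view(1, -1)).clamp(0.0, 1.0)  # (batch, groups)
    return torch.cos(0.5 * torch.pi * t_shifted) ** 2                 # in [0, 1]

def forward_diffuse(x0: torch.Tensor, t: torch.Tensor,
                    group_ids: torch.Tensor, shifts: torch.Tensor) -> torch.Tensor:
    """Apply a group-specific noise schedule to each element of the latent.
    x0: (batch, dim) clean data (e.g., frequency coefficients),
    group_ids: (dim,) integer group index assigned to each element."""
    alpha_bar = groupwise_alpha_bar(t, shifts)   # (batch, groups)
    a = alpha_bar[:, group_ids]                  # expand to (batch, dim)
    noise = torch.randn_like(x0)
    return a.sqrt() * x0 + (1.0 - a).sqrt() * noise
```

In this sketch, choosing `group_ids` to index frequency bands and staggering `shifts` would make coarse, low-frequency content resolve earlier than fine detail, which is one way the hierarchical, coarse-to-fine structure described in the abstract could arise.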
Submission Number: 6