Can Diffusion Models Disentangle? A Theoretical Perspective

Liming Wang; Muhammad Jehanzeb Mirza; Yishu Gong; Yuan Gong; Jiaqi Zhang; Brian H. Tracey; Katerina Placek; Marco Vilela; James R. Glass

Published: 18 Sept 2025, Last Modified: 29 Oct 2025

Venue: NeurIPS 2025 poster

License: CC BY 4.0

Keywords: diffusion models, disentanglement, learning theory

Abstract: This paper presents a novel theoretical framework for understanding how diffusion models can learn disentangled representations with commonly used weak supervision such as partial labels and multiple views. Within this framework, we establish identifiability conditions for diffusion models to disentangle latent variable models with \emph{stochastic}, \emph{non-invertible} mixing processes. We also prove \emph{finite-sample global convergence} for diffusion models to disentangle independent subspace models. To validate our theory, we conduct extensive disentanglement experiments on subspace recovery in latent subspace Gaussian mixture models, image colorization, denoising, and voice conversion for speech classification. Our experiments show that training strategies inspired by our theory, such as style guidance regularization, consistently enhance disentanglement performance.

Primary Area: Theory (e.g., control theory, learning theory, algorithmic game theory)

Submission Number: 3134
