Learning Disentangled Representations in Deep Generative Models

N. Siddharth; Brooks Paige; Alban Desmaison; Jan-Willem van de Meent; Frank Wood; Noah D. Goodman; Pushmeet Kohli; Philip H.S. Torr

22 Dec 2025 (modified: 21 Jul 2022). Submitted to ICLR 2017. Readers: Everyone

Abstract: Deep generative models provide a powerful and flexible means to learn complex distributions over data by incorporating neural networks into latent-variable models. Variational approaches to training such models introduce a probabilistic encoder that casts data, typically unsupervised, into an entangled and unstructured representation space. While unsupervised learning is often desirable, sometimes even necessary, when we lack prior knowledge about what to represent, being able to incorporate domain knowledge in characterising certain aspects of variation in the data can often help learn better disentangled representations. Here, we introduce a new formulation of semi-supervised learning in variational autoencoders that allows precisely this. It permits flexible specification of probabilistic encoders as directed graphical models via a stochastic computation graph, containing both continuous and discrete latent variables, with conditional distributions parametrised by neural networks. We demonstrate how the provision of structure, along with a few labelled examples indicating plausible values for some components of the latent space, can help quickly learn disentangled representations. We then evaluate its ability to do so, both qualitatively by exploring its generative capacity, and quantitatively by using the disentangled representation to perform classification, on a variety of models and datasets.

Conflicts: microsoft.com, stanford.edu, robots.ox.ac.uk, northeastern.edu, eng.ox.ac.uk

Keywords: Semi-Supervised Learning, Deep learning, Computer vision
