Keywords: Disentanglement, Equivariance, Topology, Representation theory, Character theory
Abstract: A core challenge in machine learning is to disentangle natural factors of variation in data (e.g., object shape vs. pose). A popular approach to disentanglement consists of learning to map each of these factors to a distinct subspace of a model's latent representation. However, this approach has shown limited empirical success to date. Here, we show that this approach to disentanglement introduces topological defects (i.e., discontinuities in the encoder) for a broad family of transformations acting on images, encompassing simple affine transformations such as rotations and translations. Moreover, motivated by classical results from group representation theory, we propose an alternative, more flexible approach to disentanglement that relies on distributed equivariant operators, potentially acting on the entire latent space. We theoretically and empirically demonstrate the effectiveness of our approach for disentangling affine transformations. Our work lays a theoretical foundation for the recent success of a new generation of models using distributed operators for disentanglement (see Discussion).
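To make the abstract's notion of a distributed equivariant operator concrete, here is a minimal sketch (our illustration, not the paper's code), assuming 1-D signals under cyclic translation: the encoder is the discrete Fourier transform, and the latent operator is a diagonal phase rotation acting on every latent coordinate at once (the 1-D characters of the cyclic group), rather than on a single dedicated latent subspace. All names (encode, act_on_latent, etc.) are hypothetical.

```python
# Minimal sketch of the equivariance condition
#   encode(g . x) = Z_g(encode(x))
# for the cyclic translation group acting on 1-D signals.
# This is a toy illustration under our own assumptions,
# not the paper's implementation.
import numpy as np

N = 8  # signal length; translations form the cyclic group Z_N

def encode(x):
    # Encoder: discrete Fourier transform of the signal.
    return np.fft.fft(x)

def act_on_data(x, g):
    # Group action on data: cyclic shift by g positions.
    return np.roll(x, g)

def act_on_latent(z, g):
    # Distributed latent operator Z_g: a diagonal phase rotation that
    # touches every latent coordinate (one 1-D irreducible
    # representation, i.e. character, per frequency k).
    k = np.arange(N)
    return np.exp(-2j * np.pi * k * g / N) * z

rng = np.random.default_rng(0)
x = rng.normal(size=N)
g = 3

lhs = encode(act_on_data(x, g))    # encode the shifted signal
rhs = act_on_latent(encode(x), g)  # shift in latent space instead
assert np.allclose(lhs, rhs)       # equivariance holds exactly here
```

Note how no single latent subspace "contains" the translation factor: the operator is spread across all latent coordinates, which is the flexibility the abstract contrasts with subspace-based disentanglement.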
One-sentence Summary: We use topological arguments to show that disentanglement, as commonly defined, introduces discontinuities in the encoder, leading us to propose a new approach to disentanglement through distributed equivariant operators.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Reviewed Version (pdf): https://openreview.net/references/pdf?id=DC9Yi7D3cY