Self-supervised Disentangled Representation Learning

28 Sept 2020 (modified: 05 May 2023) · ICLR 2021 Conference Withdrawn Submission · Readers: Everyone
Keywords: Disentanglement, Identifiability, Nonlinear ICA, Self-supervised Learning
Abstract: Disentanglement has been a central task in representation learning: learning interpretable factors of variation in data. Recent efforts in this direction have addressed the identifiability problem of deep latent-variable models through the theory of nonlinear ICA, i.e., whether the true latent variables can be identified or recovered by the encoder. These identifiability results for nonlinear ICA are essentially based on supervised learning. This work extends them to the self-supervised setting. First, we point out that a broad class of augmented data can be generated from a latent model. Based on this, we prove an identifiability theorem similar to the one in~\citep{khemakhem2019variational}: the latent variables generating the augmented data can be identified under mild conditions. Guided by our theory, we perform experiments on synthetic data and EMNIST with GIN~\citep{sorrenson2020disentanglement}. We find that even if the data is augmented along only a few latent variables, more latent variables can be identified, and adding small noise in data space stabilizes this outcome. Based on this, we augment digit images in EMNIST with just three affine transformations followed by small Gaussian noise, and show that many more interpretable factors of variation are successfully identified.
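The following is a minimal sketch of the augmentation scheme described in the abstract: three affine transformations (rotation, translation, scaling) applied to EMNIST digits, followed by small Gaussian noise added in data space. It is not the authors' exact pipeline; the parameter ranges, the noise standard deviation, and the use of torchvision are all assumptions.

# Sketch of the described augmentation (assumed parameters, not the paper's setup):
# three affine transformations on EMNIST digits, then small Gaussian noise.
import torch
from torchvision import datasets, transforms

class AddGaussianNoise:
    """Add small i.i.d. Gaussian noise in data space (std is an assumption)."""
    def __init__(self, std=0.05):
        self.std = std

    def __call__(self, x):
        return x + self.std * torch.randn_like(x)

augment = transforms.Compose([
    transforms.RandomAffine(
        degrees=15,            # rotation (assumed range)
        translate=(0.1, 0.1),  # translation (assumed range)
        scale=(0.9, 1.1),      # scaling (assumed range)
    ),
    transforms.ToTensor(),     # PIL image -> tensor in [0, 1]
    AddGaussianNoise(std=0.05),
])

# EMNIST "digits" split, augmented on the fly when indexed.
emnist = datasets.EMNIST(root="data", split="digits", download=True,
                         transform=augment)
x, y = emnist[0]  # augmented digit image and its label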
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Supplementary Material: zip
Reviewed Version (pdf): https://openreview.net/references/pdf?id=CLDeLe6tb