SimVAE: Narrowing the gap between Discriminative & Generative Representation Learning

Published: 02 Nov 2023, Last Modified: 18 Dec 2023, UniReps Poster
Supplementary Material: pdf
Keywords: Probabilistic representation learning; Self-supervised representation learning; Variational Methods; Generative Modelling
TL;DR: Motivated by a theoretical analysis of existing self-supervised representation learning methods, we propose a unifying graphical model that improves the performance of VAE-based methods on downstream tasks.
Abstract: Self-supervised representation learning is a powerful paradigm that leverages the relationship between semantically similar data, such as augmentations, extracts of an image or sound clip, or multiple views/modalities. Recent methods, e.g. SimCLR, CLIP and DINO, have made significant strides, yielding representations that achieve state-of-the-art results on multiple downstream tasks. Though these methods are often intuitive, a comprehensive theoretical understanding of their underlying mechanisms, or of _what_ they learn, remains elusive. Meanwhile, generative approaches, such as variational autoencoders (VAEs), fit a specific latent variable model and have principled appeal, but lag significantly in terms of performance. We present a theoretical analysis of self-supervised discriminative methods and a graphical model that reflects the assumptions they implicitly make and unifies these methods. We show that fitting this model under an ELBO objective improves representations over previous VAE methods on several common benchmarks, narrowing the gap to discriminative methods, and can also preserve information lost by discriminative approaches. This work brings new theoretical insight to modern machine learning practice.
Track: Extended Abstract Track
Submission Number: 51