Keywords: self-supervised learning, representation learning, disentanglement
Abstract: Joint-embedding *self-supervised learning* (SSL), the key paradigm for unsupervised representation learning from visual data, learns from invariances between semantically related data pairs. We study the one-to-many mapping problem in SSL, where each datum may be mapped to multiple valid targets. This problem arises when data pairs come from naturally occurring generative processes, e.g., successive video frames. We show that existing methods struggle to flexibly capture this conditional uncertainty. As a remedy, we introduce a variational distribution that models the uncertainty in the latent space and derive a lower bound on the pairwise mutual information. We also propose a simpler variant of the same idea based on sparsity regularization. Our model, AdaSSL, applies to both contrastive and predictive SSL methods, and we empirically demonstrate its advantages for identifiability, generalization, fine-grained image understanding, and world modeling on videos.
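The abstract does not reproduce the paper's exact bound; as a rough sketch, a standard Barber–Agakov-style variational lower bound on the pairwise mutual information between latents would take the form below, where the notation $z = f(x)$, $z' = f(x')$ for an encoder $f$ and the variational conditional $q(z' \mid z)$ are introduced here for illustration and need not match the paper's.

```latex
% Barber–Agakov-style variational lower bound (illustrative sketch):
% the encoder f, latents z = f(x), z' = f(x'), and variational
% distribution q(z' | z) are assumed notation, not the paper's.
I(Z; Z') = H(Z') - H(Z' \mid Z)
         \geq H(Z') + \mathbb{E}_{p(z, z')}\big[\log q(z' \mid z)\big]
```

Maximizing the expected log-likelihood term over $q$ tightens the bound, which becomes exact when $q(z' \mid z)$ matches the true conditional $p(z' \mid z)$.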
Submission Number: 101