Position: An Empirically Grounded Identifiability Theory Will Accelerate Self-Supervised Learning Research
TL;DR: To drive SSL forward, we should develop a principled theoretical understanding of SSL, grounded in empirical observations.
Abstract: Self-Supervised Learning (SSL) powers many current AI systems. As research interest and investment grow, the SSL design space continues to expand. The Platonic view of SSL, following the Platonic Representation Hypothesis (PRH), suggests that despite different methods and engineering approaches, all representations converge to the same Platonic ideal. However, this phenomenon lacks a precise theoretical explanation. By synthesizing evidence from Identifiability Theory (IT), we show that the PRH can emerge in SSL. There is a gap between SSL theory and practice: current IT cannot explain SSL's empirical success, though it has practically relevant insights. Our work formulates a blueprint for SSL research to bridge this gap: we propose expanding IT into what we term Singular Identifiability Theory (SITh), a broader theoretical framework encompassing the entire SSL pipeline. SITh would allow deeper insights into the implicit data assumptions in SSL and advance the field towards learning more interpretable and generalizable representations. We highlight three critical directions for future research: 1) training dynamics and convergence properties of SSL; 2) the impact of finite samples, batch size, and data diversity; and 3) the role of inductive biases in architecture, augmentations, initialization schemes, and optimizers.
Lay Summary: We pinpoint the gap between the empirical and theoretical advances in self-supervised representation learning (SSL): mainly, the two communities differ in focus and research questions, and there is not enough cross-pollination between them. Using the lens of identifiability theory (IT), we propose a research agenda for SSL that we believe can build upon, but needs to extend, current IT.
Primary Area: Research Priorities, Methodology, and Evaluation
Keywords: SSL, identifiability, Platonic Representation Hypothesis, model similarity, representation learning
Submission Number: 334