Understanding Self-supervised Contrastive Learning through Supervised Objectives

TMLR Paper5017 Authors

03 Jun 2025 (modified: 13 Aug 2025) · Under review for TMLR · CC BY 4.0
Abstract: Self-supervised representation learning has achieved impressive empirical success, yet its theoretical understanding remains limited. In this work, we provide a theoretical perspective by formulating self-supervised representation learning as an approximation to supervised representation learning objectives. Based on this formulation, we derive a loss function closely related to popular contrastive losses such as InfoNCE, offering insight into their underlying principles. Our derivation naturally introduces the concepts of prototype representation bias and a balanced contrastive loss, which help explain and improve the behavior of self-supervised learning algorithms. We further show how components of our theoretical framework correspond to established practices in contrastive learning. Finally, we empirically validate the effect of balancing positive and negative pair interactions. All theoretical proofs are provided in the appendix, and our code is included in the supplementary material.
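For context, the InfoNCE objective mentioned in the abstract is commonly written, for a batch of N anchor–positive pairs with L2-normalized embeddings, as a softmax cross-entropy over similarity scores. The sketch below is a minimal NumPy illustration of this standard form (the function name `info_nce`, the temperature value, and the batch layout are illustrative assumptions); it is not the paper's derived loss or its balanced variant, which are defined in the paper itself.

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """Standard InfoNCE loss (SimCLR-style), shown only for context.

    anchors, positives: arrays of shape (N, d), L2-normalized; positives[i]
    is the augmented view paired with anchors[i]. Every other positive in
    the batch serves as a negative for anchors[i].
    """
    logits = anchors @ positives.T / temperature         # (N, N) similarity scores
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))                   # matching pairs lie on the diagonal

# Illustrative usage with random (hypothetical) embeddings:
rng = np.random.default_rng(0)
z1 = rng.normal(size=(8, 16)); z1 /= np.linalg.norm(z1, axis=1, keepdims=True)
z2 = rng.normal(size=(8, 16)); z2 /= np.linalg.norm(z2, axis=1, keepdims=True)
print(info_nce(z1, z2))
```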
Submission Length: Regular submission (no more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=OrzGlOmNJa
Changes Since Last Submission: We sincerely thank the reviewers and the action editor for their thoughtful and detailed feedback. As the primary concerns in the previous submission related to how the ideas and contributions were communicated, we focused this revision on significantly improving the overall exposition and clarity of the manuscript. The key changes are summarized below.

1. **Scope Clarification**: We revised the title, abstract, and introduction to indicate that our work focuses specifically on self-supervised contrastive learning.
2. **Contribution Clarification**: We carefully revised our claims to more accurately reflect what is implied by our theoretical and empirical results (Sections 5 and 6). We also ensured that our formulation is presented as a principled perspective on contrastive loss functions, rather than a universal foundation, and revised language that could potentially be misinterpreted as overstating our contribution.
3. **Improved Positioning**: We revised the related work section to situate our approach within the broader literature on contrastive learning and perspectives on self-supervised learning.
4. **Improved Readability**: We revised the writing throughout the manuscript to improve clarity and ensure that the core ideas and contributions are more easily understood by readers.
5. **Bound Tightness**: We added experiments and a discussion on the tightness of the bounds (Figure 6 and Section A.4.5).
Assigned Action Editor: ~Han_Bao2
Submission Number: 5017