Contraction and Alienation: Towards Theoretical Understanding of Non-Contrastive Learning with Neighbor-Averaging Dynamics

15 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: Self-Supervised Finetuning, Non-Contrastive Learning
TL;DR: We reveal the contraction and alienation properties of non-contrastive learning through concise neighbor-averaging dynamics.
Abstract: Non-contrastive self-supervised learning (SSL) is a popular paradigm for learning representations by explicitly aligning positive pairs. However, due to specialized implementation details, the underlying working mechanism of non-contrastive SSL remains somewhat mysterious. In this paper, we investigate the implicit bias of non-contrastive learning with a concise framework, namely SimXIR. SimXIR optimizes the online network by alternately taking the online network from the previous round as the target network, without requiring asymmetric tricks or momentum updates. Notably, the expectation minimization inherent to SimXIR can be reformulated as *neighbor-averaging dynamics*, in which each representation is iteratively replaced with the average representation of its neighbors. Moreover, we introduce the concept of neighbor-connected groups, which organize samples through neighboring paths in the data, and assume the input sample space is composed of multiple disjoint neighbor-connected groups. We theoretically prove that the concise dynamics of SimXIR exhibit two intriguing properties: *contraction of neighbor-connected groups* and *alienation between disjoint groups*, which resemble intra-class compactness and inter-class separability in classification and help explain why non-contrastive SSL can prevent collapsed solutions. Inspired by the theoretical results, we propose a novel stage following self-supervised pre-training, namely self-supervised fine-tuning, and leverage SimXIR to further enhance representations of off-the-shelf SSL models. Experimental results demonstrate the effectiveness of SimXIR in improving self-supervised representations, ultimately achieving better performance on downstream classification tasks.
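
The following is a minimal NumPy sketch, not the authors' implementation, of the neighbor-averaging dynamics described in the abstract: at each round, every representation is replaced by the mean of its neighbors' representations. The neighbor relation used here (k-nearest neighbors in the current embedding space, with a hypothetical parameter k) is an illustrative assumption; the paper defines neighborhoods through the data itself rather than through current embeddings.

```python
# Sketch of neighbor-averaging dynamics (illustrative only, not the paper's code).
import numpy as np

def neighbor_average_step(Z: np.ndarray, k: int = 5) -> np.ndarray:
    """One round of the dynamics: Z[i] <- mean of Z over i's k nearest neighbors."""
    # Pairwise squared Euclidean distances between current representations.
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)          # exclude each point from its own neighborhood
    nbrs = np.argsort(d2, axis=1)[:, :k]  # indices of the k nearest neighbors
    return Z[nbrs].mean(axis=1)           # replace each point by its neighbors' average

# Toy usage: two well-separated clusters stand in for disjoint neighbor-connected
# groups. Under repeated averaging, the within-group spread shrinks (a rough analogue
# of contraction) while the two group means remain far apart.
rng = np.random.default_rng(0)
Z = np.vstack([rng.normal(0.0, 0.3, (50, 2)), rng.normal(5.0, 0.3, (50, 2))])
for _ in range(10):
    Z = neighbor_average_step(Z, k=5)
print(Z[:50].std(axis=0), Z[50:].std(axis=0))    # within-group spread shrinks
print(Z[:50].mean(axis=0), Z[50:].mean(axis=0))  # group means stay separated
```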
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 222