Rethinking Metric Based Contrastive Learning Method’s Generalization Capability

22 Sept 2022 (modified: 13 Feb 2023) | ICLR 2023 Conference Withdrawn Submission | Readers: Everyone
Abstract: In recent years, semi-supervised and self-supervised methods based on contrastive learning have made great empirical progress across many fields of deep learning, and even outperform supervised methods in some of them (such as NLP and CV). However, very few theoretical works explain why a model trained with contrastive learning can outperform one trained with standard supervised methods on supervised tasks. Based on the manifold assumption about the input space, this work proposes three elements of metric-based contrastive learning: (1) an augmented neighborhood defined for every point in the input space; (2) a metric-based optimization loss on the output space; and (3) the generalization error on the union of the augmented neighborhoods. We further propose an upper bound on (3), named UBGEAN (Upper Bound of Generalization Error on Augmented Neighborhood), which relates the labeled empirical loss to an unlabeled metric-based contrastive loss. We also explain how existing contrastive semi-supervised/self-supervised methods relate to this upper bound. Finally, we propose a supervised consistent contrastive learning method based on the bound. We verify UBGEAN's generalization capacity against the plain empirical loss through a series of experiments, achieving an average improvement of 8.2275% across 4 tasks. We also design a second set of experiments on fine-tuning self-supervised contrastive pre-trained models, which shows that our upper bound provides a more stable fine-tuning procedure, allowing self-supervised contrastive pre-trained models to match the performance of supervised pre-trained models.
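The objective described in the abstract, a labeled empirical loss combined with an unlabeled metric-based contrastive loss over augmented neighborhoods, can be illustrated with a short sketch. The following is a minimal PyTorch-style sketch, not the paper's actual UBGEAN formulation: the function name ubgean_style_loss, the squared-Euclidean metric, the augment routine, and the weighting factor lam are all illustrative assumptions.

    import torch
    import torch.nn.functional as F

    def ubgean_style_loss(model, x_labeled, y_labeled, x_unlabeled, augment, lam=1.0):
        # (1) Labeled empirical loss: the supervised term of the bound.
        sup_loss = F.cross_entropy(model(x_labeled), y_labeled)

        # (2) Metric-based loss on the output space: pull together the model's
        # outputs for two samples drawn from the same augmented neighborhood.
        # augment is assumed to be stochastic, so the two views differ; the
        # squared Euclidean distance is a stand-in for the paper's metric.
        z_a = model(augment(x_unlabeled))
        z_b = model(augment(x_unlabeled))
        metric_loss = F.mse_loss(z_a, z_b)

        # The bound motivates minimizing a weighted sum of both terms.
        return sup_loss + lam * metric_loss

Minimizing such a weighted sum over labeled and unlabeled data together is the general shape of a supervised consistent contrastive learning objective; the paper's exact loss, metric, and weighting are defined in the full text.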
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Unsupervised and Self-supervised learning
Supplementary Material: zip