Keywords: self-supervised learning, contrastive learning, lightweight model
TL;DR: We improve the linear evaluation accuracy on ImageNet from 36.3\% to 62.3\% for MobileNet-V3-Large and from 42.2\% to 65.8\% for EfficientNet-B0, closing the accuracy gap to ResNet-50, which contains $5\times$ more parameters.
Abstract: While self-supervised contrastive learning has made continuous progress with big models, its performance lags far behind when the model size decreases. A common practice to address this problem is a two-stage training procedure: a larger model is first pretrained in a self-supervised manner, and its representational knowledge is then transferred to a smaller model in the second stage. Despite its effectiveness, this method is highly time-consuming and is inapplicable to some resource-limited scenarios. In this work, we aim to directly train a lightweight contrastive model with satisfactory performance in the absence of a pretrained teacher model. Specifically, by empirically exploring the training recipes (e.g., the MLP head, a lower temperature, etc.), we boost the accuracy of different lightweight models by a large margin. Furthermore, we observe that smaller models are more sensitive to noisy labels, and propose a smoothed version of the InfoNCE loss to alleviate this problem. With these combined techniques, we improve the linear evaluation accuracy on ImageNet from 36.3\% to 62.3\% for MobileNet-V3-Large and from 42.2\% to 65.8\% for EfficientNet-B0, closing the accuracy gap to ResNet-50, which contains $5\times$ more parameters. These results suggest the feasibility of training lightweight self-supervised models without distillation.
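The abstract describes a smoothed InfoNCE objective to reduce sensitivity to noisy labels. The snippet below is a minimal PyTorch sketch of one straightforward way to realize this idea, label smoothing over the in-batch instance-discrimination targets; the function name, default temperature, and smoothing scheme are illustrative assumptions and may differ from the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def smoothed_info_nce(query, key, temperature=0.1, smoothing=0.1):
    """Sketch of a label-smoothed InfoNCE loss (assumed formulation).

    query, key: embeddings of shape (N, D) from two augmented views,
    where matching rows are positives and all other rows are negatives.
    `smoothing` spreads a small amount of target mass onto the negatives.
    """
    q = F.normalize(query, dim=1)
    k = F.normalize(key, dim=1)
    logits = q @ k.t() / temperature                      # (N, N) similarity matrix
    targets = torch.arange(q.size(0), device=q.device)    # positives on the diagonal
    # label_smoothing (PyTorch >= 1.10) softens the one-hot instance labels
    return F.cross_entropy(logits, targets, label_smoothing=smoothing)
```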
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning
Supplementary Material: zip