Slimmed Asymmetrical Contrastive Learning and Cross Distillation for Lightweight Model Training

Published: 21 Sept 2023, Last Modified: 02 Nov 2023, NeurIPS 2023 poster
Keywords: Contrastive Learning, Self-supervised Learning, Energy-efficient contrastive learning
TL;DR: A novel contrastive learning algorithm designed for lightweight model training from scratch.
Abstract: Contrastive learning (CL) has been widely investigated with various learning mechanisms and has shown strong capability in learning data representations in a self-supervised manner from unlabeled data. A common practice in this line of work is to employ very large encoders to achieve performance comparable to that of supervised counterparts. Despite the success of label-free training, current contrastive learning algorithms *fail* to achieve good performance with lightweight (compact) models, e.g., MobileNet, while the reliance on heavy encoders impedes energy-efficient computation, especially for resource-constrained AI applications. Motivated by this, we propose a new self-supervised CL scheme, named SACL-XD, consisting of two technical components, **S**limmed **A**symmetrical **C**ontrastive **L**earning (SACL) and **Cross**-**D**istillation (XD), which collectively enable efficient CL with compact models. While relevant prior works employ a strong pre-trained model as the teacher for unsupervised knowledge distillation to a lightweight encoder, our proposed method trains CL models from scratch and outperforms them without such an expensive requirement. Compared to the SoTA lightweight CL training (distillation) algorithms, SACL-XD achieves a 1.79% ImageNet-1K accuracy improvement on MobileNet-V3 with a 64$\times$ reduction in training FLOPs.
Supplementary Material: pdf
Submission Number: 9997