Deciphering the Projection Head: Representation Evaluation Self-supervised Learning

Jiajun Ma

Published: 01 Aug 2024, Last Modified: 28 Sept 2024IJCAI2024EveryoneCC BY 4.0

Abstract: Self-supervised learning (SSL) aims to learn the intrinsic features of data without labels. Despite the diverse SSL architectures, the projection head always plays an important role in improving downstream task performance. In this study, we systematically investigate the role of the projection head in SSL. We find that the projection head targets the uniformity aspect, which maps samples into uniform distribution and enables the encoder to focus on extracting semantic features. Drawing on this insight, we propose a Representation Evaluation Design (RED) in SSL models in which a shortcut connection between the representation and the projection vectors is built. Our extensive experiments with different architectures (including SimCLR, MoCo-V2, and SimSiam) on various datasets demonstrate that the RED-SSL consistently outperforms their baseline counterparts in downstream tasks. Furthermore, the RED-SSL learned representations exhibit superior robustness to previously unseen augmentations and out-of-distribution data.