Using Representation Expressiveness and Learnability to Evaluate Self-Supervised Learning Methods
Abstract: We address the problem of evaluating the quality of self-supervised learning (SSL) models without access to supervised labels, while being agnostic to the architecture, learning algorithm or data manipulation used during training. We argue that representations can be evaluated through the lens of expressiveness and learnability. We propose to use the Intrinsic Dimension (ID) to assess expressiveness and introduce Cluster Learnability (CL) to assess learnability. CL is measured in terms of the performance of a KNN classifier trained to predict labels obtained by clustering the representations with K-means. We thus combine CL and ID into a single predictor – CLID. Through a large-scale empirical study with a diverse family of SSL algorithms, we find that CLID better correlates with in-distribution model performance than other competing recent evaluation schemes. We also benchmark CLID on out-of-domain generalization, where CLID serves as a predictor of the transfer performance of SSL models on several visual classification tasks, yielding improvements with respect to the competing baselines.
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Length: Regular submission (no more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=pWqrWG6zw2&referrer=%5BAuthor%20Console%5D(%2Fgroup%3Fid%3DTMLR%2FAuthors%23your-submissions)
Changes Since Last Submission: As clarified to us by the AE (see https://openreview.net/forum?id=pWqrWG6zw2&noteId=h9VFIJpszC), the main criticism concerned the dependence of our cluster learnability metric on the chosen ordering of the dataset. In this new submission, we eliminate this problem by moving away from our initial prequential approach to cluster learnability. We now define cluster learnability as the **average validation accuracy** of a KNN classifier trained on the labelled dataset obtained by clustering the representations with K-means. We reproduced our experiments using this (indeed much simpler) definition; all our correlation results remain unchanged.
Supplementary Material: pdf
Assigned Action Editor: ~Alexander_A_Alemi1
Submission Number: 1269
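
The cluster learnability definition given above (validation accuracy of a KNN classifier trained on K-means pseudo-labels) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names (`kmeans_labels`, `knn_predict`, `cluster_learnability`), the default hyperparameters, and the use of shuffled cross-validation folds to obtain an "average" validation accuracy are all assumptions made for illustration. Pure standard-library Python is used only to keep the example self-contained, and the toy 2-D points stand in for learned representations.

```python
import math
import random
from collections import Counter


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points]


def kmeans_labels(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means; returns one pseudo-label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialise centers from the data
    for _ in range(iters):
        labels = assign(points, centers)
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign(points, centers)


def knn_predict(train_x, train_y, query, n_neighbors):
    """Majority vote among the n_neighbors closest training points."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(query, train_x[i]))[:n_neighbors]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def cluster_learnability(points, k, n_neighbors=3, n_folds=5, seed=0):
    """Mean held-out KNN accuracy on K-means pseudo-labels (assumed protocol)."""
    labels = kmeans_labels(points, k, seed=seed)
    order = list(range(len(points)))
    random.Random(seed).shuffle(order)  # folds do not follow dataset order
    folds = [order[f::n_folds] for f in range(n_folds)]
    accuracies = []
    for val_idx in folds:
        held_out = set(val_idx)
        train_idx = [i for i in order if i not in held_out]
        train_x = [points[i] for i in train_idx]
        train_y = [labels[i] for i in train_idx]
        correct = sum(
            knn_predict(train_x, train_y, points[i], n_neighbors) == labels[i]
            for i in val_idx
        )
        accuracies.append(correct / len(val_idx))
    return sum(accuracies) / n_folds


# Toy usage: two well-separated 2-D blobs standing in for representations.
rng = random.Random(1)
toy = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(30)]
toy += [(rng.gauss(10, 0.5), rng.gauss(10, 0.5)) for _ in range(30)]
print(cluster_learnability(toy, k=2))
```

Because the folds are drawn from a shuffled index list and the accuracies are averaged, the score in this sketch does not depend on the order in which the dataset is stored, which is the property the revised definition is meant to provide.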