Squeezing Water from a Stone: Improving Pre-Trained Self-Supervised Embeddings Through Effective Entropy Maximization

Published: 13 Oct 2024, Last Modified: 02 Dec 2024
Venue: NeurIPS 2024 Workshop SSL
License: CC BY 4.0
Keywords: Information theory, entropy maximization, self-supervised learning, representation learning
TL;DR: We propose a simple add-on SSL method that improves state-of-the-art SSL embeddings using only a few additional epochs of continued pre-training.
Abstract: A number of different architectures and loss functions have been applied to the problem of self-supervised learning (SSL), with the goal of developing embeddings that provide the best possible pre-training for as-yet-unknown, lightly supervised downstream tasks. One such SSL criterion is to maximize the entropy of a set of embeddings in some compact space. But the goal of maximizing the embedding entropy often depends, whether explicitly or implicitly, on high-dimensional entropy estimates, which typically perform poorly in more than a few dimensions. In this paper, we motivate a simple maximum-entropy criterion, defined in terms of easy-to-estimate, low-dimensional constraints, and demonstrate that using it to continue training an already-trained SSL model for only a handful of epochs leads to a consistent and, in some cases, significant improvement in performance.
Submission Number: 14
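
The abstract does not spell out the paper's exact objective, but the core idea (replacing a hard-to-estimate high-dimensional entropy with easy-to-estimate low-dimensional constraints, applied during a few epochs of continued pre-training) can be illustrated with a minimal PyTorch sketch. Everything here is an assumption for illustration: the per-coordinate (1-D marginal) kernel-density entropy estimator, the bandwidth `h`, the unit-sphere normalization, and the `encoder`/`optimizer` names are not taken from the paper.

```python
# A minimal sketch, NOT the paper's actual formulation: continue pre-training
# an existing SSL encoder with a loss that maximizes cheap 1-D (per-coordinate)
# entropy estimates of the embeddings instead of a full high-dimensional one.
import math
import torch
import torch.nn.functional as F

def marginal_entropy(z: torch.Tensor, h: float = 0.1) -> torch.Tensor:
    """Differentiable 1-D entropy estimate for each embedding coordinate.

    Plug-in Gaussian KDE estimator, applied independently per dimension:
        H_d ~ -mean_i log( mean_j N(z[i,d] - z[j,d]; 0, h^2) )
    z: (N, D) batch of embeddings; returns a (D,) tensor of entropies.
    """
    diff = z.unsqueeze(0) - z.unsqueeze(1)  # (N, N, D) pairwise differences
    log_kernel = -0.5 * (diff / h) ** 2 - math.log(h * math.sqrt(2 * math.pi))
    # log density at each sample, per dimension, via log-mean-exp over centers
    log_density = torch.logsumexp(log_kernel, dim=1) - math.log(z.shape[0])
    return -log_density.mean(dim=0)  # (D,)

def continued_pretraining_step(encoder, images, optimizer):
    """One gradient step of entropy-maximizing continued pre-training (sketch)."""
    z = F.normalize(encoder(images), dim=1)  # embeddings on the unit sphere (a compact space)
    loss = -marginal_entropy(z).sum()        # maximize the sum of 1-D marginal entropies
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In practice, a term like this would presumably be combined with the model's original SSL loss or an alignment constraint, so that invariance is preserved while the marginal entropies are pushed up; the bandwidth and the number of continued pre-training epochs are hyperparameters of the sketch, not values reported by the paper.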