Abstract: Supervised deep learning models encounter two major challenges: the need for large labeled training datasets, and parameter overhead, which leads to extensive GPU usage and other computational resource requirements. Several CNN models achieve state-of-the-art performance but compromise on one of these challenges. Self-supervised models reduce the requirement for labeled training data; however, the problems of parameter overhead and GPU usage are rarely addressed. This article proposes a method that addresses both challenges for the image classification task. We introduce a transfer learning approach for a target dataset, in which we take the learned features from a self-supervised model after reducing its parameter count by removing the final layer. The learned features are then fed into a CNN, followed by a multilayer perceptron (MLP), where the hyperparameters of both the CNN and the MLP are automatically tuned (autotuned) using a Bayesian-optimization-based technique. Furthermore, we reduce giga floating-point operations (GFLOPs) by limiting the hyperparameter search space, without compromising performance. The first challenge is addressed by utilizing the learned representations from the self-supervised model as a foundation for knowledge transfer in the proposed model. Rather than relying solely on labeled data, we exploit the insights gained from unlabeled data by transferring knowledge from self-supervised models to the target task, hence reducing the cost and effort associated with data annotation. We address the second challenge by utilizing a reduced self-supervised backbone model and constraining the search space. We establish the efficacy of the approach through experiments on a wide variety of benchmark datasets, including CIFAR-10, CIFAR-100, Oxford-IIIT Pet, Oxford 102 Flowers, and Caltech-101.
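The abstract describes a three-step pipeline: extract features from a self-supervised backbone with its final layer removed, feed them to a small CNN-plus-MLP head, and autotune the head's hyperparameters with Bayesian optimization over a deliberately narrow search space. The sketch below is a minimal illustration of that pipeline, assuming a randomly initialized ResNet-18 as a stand-in for the self-supervised encoder and Optuna's TPE sampler as the Bayesian optimizer; the encoder, tuning library, search bounds, head layout, and toy training loop are all assumptions, not the authors' implementation.

```python
# Illustrative sketch of the abstract's pipeline (all concrete choices assumed).
import torch
import torch.nn as nn
import torchvision.models as models
import optuna

# 1) Backbone with the final layer removed, frozen and used only as a
#    feature extractor (stand-in for the paper's self-supervised encoder).
backbone = models.resnet18(weights=None)
backbone = nn.Sequential(*list(backbone.children())[:-1])  # drop the last layer
backbone.eval()
for p in backbone.parameters():
    p.requires_grad = False  # freezing keeps per-trial compute (GFLOPs) low


def build_head(n_filters: int, hidden: int, n_classes: int) -> nn.Module:
    """Small CNN + MLP head; widths are chosen by the tuner."""
    return nn.Sequential(
        nn.Conv2d(512, n_filters, kernel_size=1),  # 512 = ResNet-18 feature dim
        nn.ReLU(),
        nn.Flatten(),
        nn.Linear(n_filters, hidden),
        nn.ReLU(),
        nn.Linear(hidden, n_classes),
    )


def objective(trial: optuna.Trial) -> float:
    # 3) Deliberately constrained search space: few, coarse choices keep the
    #    number and cost of trials down, per the abstract's GFLOPs argument.
    n_filters = trial.suggest_categorical("n_filters", [64, 128, 256])
    hidden = trial.suggest_categorical("hidden", [128, 256])
    lr = trial.suggest_float("lr", 1e-4, 1e-2, log=True)

    head = build_head(n_filters, hidden, n_classes=10)  # e.g., CIFAR-10
    opt = torch.optim.Adam(head.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()

    # 2) Toy training loop on random tensors; replace with real dataloaders.
    for _ in range(3):
        x = torch.randn(32, 3, 224, 224)
        y = torch.randint(0, 10, (32,))
        with torch.no_grad():
            feats = backbone(x)  # frozen feature extraction
        loss = loss_fn(head(feats), y)
        opt.zero_grad()
        loss.backward()
        opt.step()

    # Return a validation metric; here the negated final training loss
    # stands in for held-out accuracy.
    return -loss.item()


# Bayesian-style hyperparameter search (TPE) over the constrained space.
study = optuna.create_study(direction="maximize",
                            sampler=optuna.samplers.TPESampler(seed=0))
study.optimize(objective, n_trials=10)
print(study.best_params)
```

In this reading, the savings come from two places: the frozen, truncated backbone is evaluated under `torch.no_grad()` so only the small head is trained, and the categorical search space caps how many head configurations the optimizer can propose.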
External IDs: dblp:journals/tai/KishoreM24