Simplifying Knowledge Transfer in Pretrained Models

TMLR Paper 5203 Authors

25 Jun 2025 (modified: 07 Jul 2025) · Under review for TMLR · CC BY 4.0
Abstract: Pretrained models are ubiquitous in the current deep learning landscape, offering strong results on a broad range of tasks. Recent work has shown that models differing in various design choices exhibit categorically different generalization behavior, so that one model captures data-specific insights that remain unavailable to the other. In this paper, we propose to leverage large publicly available model repositories as an auxiliary source of model improvements. We introduce a data partitioning strategy in which pretrained models autonomously adopt either the role of a student, seeking knowledge, or that of a teacher, imparting knowledge, fostering a collaborative learning environment. Experiments across various tasks demonstrate the effectiveness of our proposed approach. In image classification, we improve the performance of ViT-B by approximately 1.4% through bidirectional knowledge transfer with ViT-T. For semantic segmentation, our method boosts all evaluation metrics by enabling knowledge transfer both within and across backbone architectures. In video saliency prediction, our approach achieves new state-of-the-art results. We further extend our approach to knowledge transfer among multiple models, leading to considerable performance improvements for all participants.
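To make the described mechanism concrete, below is a minimal PyTorch sketch of one training step of bidirectional transfer between two classifiers: on each sample, the model with the lower loss acts as teacher and the other as student, with knowledge imparted via temperature-scaled KL distillation. This is an illustrative assumption about the mechanism, not the paper's actual method; all names (mutual_transfer_step, the loss-based partition rule, tau) are hypothetical.

import torch
import torch.nn.functional as F

def mutual_transfer_step(model_a, model_b, x, y, optim_a, optim_b, tau=2.0):
    # Hypothetical sketch: on each sample, the model with the lower
    # cross-entropy acts as teacher; the other acts as student and is
    # pulled toward the teacher's (detached) predictive distribution.
    logits_a, logits_b = model_a(x), model_b(x)
    ce_a = F.cross_entropy(logits_a, y, reduction="none")
    ce_b = F.cross_entropy(logits_b, y, reduction="none")

    # Per-sample data partition: True where model_a currently knows more.
    a_teaches = ce_a < ce_b

    def distill(student_logits, teacher_logits, mask):
        # Temperature-scaled KL divergence on the masked subset only.
        if not mask.any():
            return student_logits.new_zeros(())
        p_teacher = F.softmax(teacher_logits[mask].detach() / tau, dim=-1)
        log_p_student = F.log_softmax(student_logits[mask] / tau, dim=-1)
        return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * tau ** 2

    # Each model trains on its own task loss plus distillation on the
    # samples where the other model is currently the better teacher.
    loss_a = ce_a.mean() + distill(logits_a, logits_b, ~a_teaches)
    loss_b = ce_b.mean() + distill(logits_b, logits_a, a_teaches)

    optim_a.zero_grad(); loss_a.backward(); optim_a.step()
    optim_b.zero_grad(); loss_b.backward(); optim_b.step()
    return loss_a.item(), loss_b.item()

In the image-classification setting from the abstract, model_a and model_b would be, for example, a ViT-B and a ViT-T trained jointly in this fashion; since the teacher's logits are detached on each partition, each model receives gradient only where it is currently the student.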
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Stephen_Lin1
Submission Number: 5203