Simplifying Knowledge Transfer in Pretrained Models

TMLR Paper 5203 Authors

25 Jun 2025 (modified: 07 Jul 2025) · Under review for TMLR · CC BY 4.0
Abstract: Pretrained models are ubiquitous in the current deep learning landscape, offering strong results on a broad range of tasks. Recent work has shown that models differing in various design choices exhibit categorically different generalization behavior, so that one model captures data-specific insights that remain unavailable to the other. In this paper, we propose to leverage large publicly available model repositories as an auxiliary source of model improvements. We introduce a data partitioning strategy in which pretrained models autonomously adopt either the role of a student, seeking knowledge, or that of a teacher, imparting knowledge, fostering a collaborative learning environment. Experiments across various tasks demonstrate the effectiveness of our proposed approach. In image classification, we improve the performance of ViT-B by approximately 1.4% through bidirectional knowledge transfer with ViT-T. For semantic segmentation, our method boosts all evaluation metrics by enabling knowledge transfer both within and across backbone architectures. In video saliency prediction, our approach achieves new state-of-the-art results. We further extend our approach to knowledge transfer among multiple models, leading to considerable performance improvements for all participants.
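To make the described mechanism concrete, below is a minimal PyTorch sketch of one training step of bidirectional transfer between two classifiers: on each sample, the model with the lower loss acts as teacher and the other as student, with knowledge imparted via temperature-scaled KL distillation. This is an illustrative assumption about the mechanism, not the paper's actual method; all names (mutual_transfer_step, the loss-based partition rule, tau) are hypothetical.

import torch
import torch.nn.functional as F

def mutual_transfer_step(model_a, model_b, x, y, optim_a, optim_b, tau=2.0):
    # Hypothetical sketch: on each sample, the model with the lower
    # cross-entropy acts as teacher; the other acts as student and is
    # pulled toward the teacher's (detached) predictive distribution.
    logits_a, logits_b = model_a(x), model_b(x)
    ce_a = F.cross_entropy(logits_a, y, reduction="none")
    ce_b = F.cross_entropy(logits_b, y, reduction="none")

    # Per-sample data partition: True where model_a currently knows more.
    a_teaches = ce_a < ce_b

    def distill(student_logits, teacher_logits, mask):
        # Temperature-scaled KL divergence on the masked subset only.
        if not mask.any():
            return student_logits.new_zeros(())
        p_teacher = F.softmax(teacher_logits[mask].detach() / tau, dim=-1)
        log_p_student = F.log_softmax(student_logits[mask] / tau, dim=-1)
        return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * tau ** 2

    # Each model trains on its own task loss plus distillation on the
    # samples where the other model is currently the better teacher.
    loss_a = ce_a.mean() + distill(logits_a, logits_b, ~a_teaches)
    loss_b = ce_b.mean() + distill(logits_b, logits_a, a_teaches)

    optim_a.zero_grad(); loss_a.backward(); optim_a.step()
    optim_b.zero_grad(); loss_b.backward(); optim_b.step()
    return loss_a.item(), loss_b.item()

In the image-classification setting from the abstract, model_a and model_b would be, for example, a ViT-B and a ViT-T trained jointly in this fashion; since the teacher's logits are detached on each partition, each model receives gradient only where it is currently the student.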
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Stephen_Lin1
Submission Number: 5203