Keywords: knowledge distillation, hierarchical distillation, self-supervision, cross-scale image processing.
Abstract: Knowledge distillation, a widely used model compression technique, enables a smaller student network to replicate the performance of a larger teacher network by transferring knowledge, typically in the form of softened class probabilities or feature representations. However, current approaches often fail to maximize the teacher’s feature extraction capabilities, as they treat the semantic information transfer between teacher and student as equal. This paper presents a novel framework that addresses this limitation by enhancing the teacher’s learning process through the Variable Scale Distillation Framework. Central to our approach is the Rescale Block, which preserves scale consistency during hierarchical distillation, allowing the teacher to extract richer, more informative features. In extensive experiments on the CIFAR100 dataset, our method consistently outperforms state-of-the-art distillation techniques, achieving an average accuracy improvement of 2.12\%. This demonstrates the effectiveness of our approach in fully leveraging the teacher’s capacity to guide the student, pushing the boundaries of knowledge distillation.
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 1540
Loading