Semantic Decoupled Distillation

19 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: Knowledge distillation, classification
TL;DR: We decouple the global logit knowledge into consistent and complementary local logit knowledge to enhance logit distillation.
Abstract: Logit knowledge distillation has attracted increasing attention in recent studies due to its practicality. This paper argues that existing logit-based methods may be sub-optimal because they only leverage the global logit output, which couples multiple kinds of semantic knowledge. To this end, we propose a simple but effective method, semantic decoupled distillation (SDD), for logit knowledge distillation. SDD decouples the global logit output into multiple local logit outputs and establishes a transfer pipeline for each of them. This helps the student mine and inherit richer and less ambiguous logit knowledge. Moreover, the decoupled knowledge can be further divided into consistent and complementary logit knowledge, which transfer multi-scale information and sample ambiguity, respectively. SDD introduces dynamic weights for these two parts to adapt to different tasks and data scenarios. Extensive experiments on several benchmark datasets demonstrate the effectiveness of SDD for a wide range of teacher-student pairs, especially on fine-grained classification tasks.
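To make the abstract's idea concrete, below is a minimal PyTorch sketch of one plausible reading of the method: local logits are obtained by pooling the final feature map over several grid partitions and applying the shared classifier to each cell, and a KL distillation loss is computed per cell, with cells whose teacher prediction agrees with the global teacher prediction treated as "consistent" and the rest as "complementary", each weighted separately. The function names (`local_logits`, `sdd_style_loss`), the grid scales, and the weighting scheme are illustrative assumptions, not the authors' reference implementation.

```python
import torch
import torch.nn.functional as F


def local_logits(feat, classifier, grid):
    """Pool the feature map over a grid x grid partition and apply the shared
    linear classifier to each cell, giving grid*grid local logit vectors.
    feat: (B, C, H, W); classifier: nn.Linear(C, num_classes)."""
    pooled = F.adaptive_avg_pool2d(feat, grid)        # (B, C, grid, grid)
    pooled = pooled.flatten(2).transpose(1, 2)        # (B, grid*grid, C)
    return classifier(pooled)                         # (B, grid*grid, K)


def sdd_style_loss(feat_s, feat_t, cls_s, cls_t,
                   grids=(1, 2, 4), T=4.0,
                   w_consistent=1.0, w_complementary=2.0):
    """Hedged sketch of a decoupled logit-distillation loss: temperature-scaled
    KL between teacher and student local logits at several scales, with a
    larger weight on 'complementary' cells whose local teacher prediction
    disagrees with the global (1x1) teacher prediction."""
    global_pred_t = local_logits(feat_t, cls_t, 1).argmax(-1)     # (B, 1)
    loss = feat_s.new_zeros(())
    for g in grids:
        z_s = local_logits(feat_s, cls_s, g)                      # (B, g*g, K)
        z_t = local_logits(feat_t, cls_t, g)
        kl = F.kl_div(F.log_softmax(z_s / T, dim=-1),
                      F.softmax(z_t / T, dim=-1),
                      reduction='none').sum(-1) * T * T           # (B, g*g)
        consistent = (z_t.argmax(-1) == global_pred_t).float()    # (B, g*g)
        weight = consistent * w_consistent + (1 - consistent) * w_complementary
        loss = loss + (weight * kl).mean()
    return loss
```

In this sketch the 1x1 grid recovers the standard global logit distillation term, while the finer grids carry the decoupled local knowledge; the two hyperparameters stand in for the paper's dynamic weights, which the abstract says are adapted per task and dataset.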
Primary Area: general machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 1542