Black-box Knowledge Distillation

22 Sept 2022 (modified: 13 Feb 2023) · ICLR 2023 Conference Withdrawn Submission · Readers: Everyone
Keywords: black-box model, knowledge distillation
TL;DR: We introduce an approach for black-box knowledge distillation via prediction augmentations and multi-level prediction alignment.
Abstract: Knowledge Distillation (KD) aims to distill knowledge from a large teacher model into a lightweight student model. While effective at enhancing model efficiency, mainstream methods often rely on the assumption that the teacher model is white-box (i.e., visible during distillation). However, this assumption does not always hold due to commercial, privacy, or safety concerns, which prevents these strong methods from being applied. To address this dilemma, in this paper we consider black-box knowledge distillation, an interesting yet challenging problem that aims to distill teacher knowledge when only the teacher's predictions are accessible (i.e., the teacher model is invisible). Some early KD methods can be applied directly to black-box knowledge distillation, but their performance is unsatisfactory. In this paper, we propose a simple yet effective approach that makes better use of teacher predictions through prediction augmentation and multi-level prediction alignment. Within this framework, the student model learns from more diverse teacher predictions. Moreover, prediction alignment is conducted not only at the instance level but also at the batch and class levels, so that the student model learns instance predictions, input correlations, and category correlations simultaneously. Extensive experimental results validate that our method consistently outperforms previous black-box methods and even reaches performance competitive with mainstream white-box methods. We promise to release our code and models to ensure reproducibility.
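As a rough illustration of the multi-level alignment described in the abstract, the sketch below aligns student predictions with black-box teacher predictions at the instance, batch, and class levels. All function names, loss formulations, and weightings here are our own illustrative assumptions (the prediction-augmentation step is omitted), not the authors' released implementation; only the teacher's output probabilities are assumed available.

```python
# Illustrative sketch of multi-level prediction alignment for black-box KD.
# Only teacher *predictions* (probabilities) are assumed accessible, not the model.
import torch
import torch.nn.functional as F

def multilevel_alignment_loss(student_logits: torch.Tensor,
                              teacher_probs: torch.Tensor) -> torch.Tensor:
    """Hypothetical loss aligning a (B x C) batch of student logits with
    (B x C) black-box teacher probabilities at three levels."""
    # Instance level: soft-label KL divergence per input.
    s_log = F.log_softmax(student_logits, dim=1)
    loss_inst = F.kl_div(s_log, teacher_probs, reduction="batchmean")

    s_prob = s_log.exp()
    # Batch level: match input-correlation (B x B) matrices built from predictions.
    s_n, t_n = F.normalize(s_prob, dim=1), F.normalize(teacher_probs, dim=1)
    loss_batch = F.mse_loss(s_n @ s_n.t(), t_n @ t_n.t())

    # Class level: match category-correlation (C x C) matrices.
    s_c, t_c = F.normalize(s_prob, dim=0), F.normalize(teacher_probs, dim=0)
    loss_class = F.mse_loss(s_c.t() @ s_c, t_c.t() @ t_c)

    # Equal weighting is an arbitrary placeholder choice.
    return loss_inst + loss_batch + loss_class
```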
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning
Supplementary Material: zip
