Knowledge distillation using unlabeled mismatched images

Mandar Kulkarni; Kalpesh Patil; Shirish Karande

05 Jul 2025 (modified: 12 Mar 2017), ICLR 2017, Readers: Everyone

Abstract: Current approaches to Knowledge Distillation (KD) either use the training data directly or sample from the training data distribution. In this paper, we demonstrate the effectiveness of 'mismatched' unlabeled stimulus for performing KD on image classification networks. For illustration, we consider scenarios where there is a complete absence of training data, or where mismatched stimulus has to be used to augment a small amount of training data. We demonstrate that stimulus complexity is a key factor in achieving good distillation performance. Our examples include the use of various datasets for stimulating MNIST and CIFAR teachers.
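
To make the idea of distilling from unlabeled stimulus concrete, here is a minimal toy sketch and not the authors' implementation. It assumes a frozen linear "teacher", a linear "student", and random vectors standing in for mismatched images; the names `stimulus`, `teacher_w`, `student_w`, and `distillation_loss` are all hypothetical. The point is only that the loss needs the teacher's outputs, not ground-truth labels.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Row-wise softmax of logits divided by a temperature."""
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """Mean cross-entropy between softened teacher and student distributions."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return float(-(p_teacher * np.log(p_student)).sum(axis=1).mean())

# Hypothetical stand-ins: random vectors play the role of unlabeled,
# mismatched stimulus; a fixed random matrix plays the trained teacher.
rng = np.random.default_rng(0)
stimulus = rng.normal(size=(8, 16))
teacher_w = rng.normal(size=(16, 10))
student_w = np.zeros((16, 10))  # untrained student predicts uniform classes

teacher_logits = stimulus @ teacher_w  # teacher responses; no labels involved
loss_before = distillation_loss(stimulus @ student_w, teacher_logits)

# A few plain gradient-descent steps at temperature 1, where the gradient
# of the loss with respect to the student logits is (p_student - p_teacher) / N.
learning_rate = 0.1
for _ in range(50):
    p_student = softmax(stimulus @ student_w)
    p_teacher = softmax(teacher_logits)
    grad_w = stimulus.T @ (p_student - p_teacher) / len(stimulus)
    student_w -= learning_rate * grad_w

loss_after = distillation_loss(stimulus @ student_w, teacher_logits)
```

In this sketch the student only ever sees the teacher's soft outputs on the stimulus, so the same loop would apply whether the stimulus comes from the original training distribution or from an unrelated dataset.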

TL;DR: Distilling knowledge from neural networks under the assumption that the training data is not available.

Conflicts: tcs.com

Keywords: Deep learning, Transfer Learning
