Abstract: A deep neural network (DNN) is said to be undistillable if, when used as a black-box input-
output teacher, it cannot be distilled through knowledge distillation (KD). In this case, the
distilled student (referred to as the knockoff student) does not outperform a student trained
independently with label smoothing (LS student) in terms of prediction accuracy. To protect
intellectual property of DNNs, it is desirable to build undistillable DNNs. To this end, it is
first observed that an undistillable DNN may have the trait that each cluster of its output
probability distributions, formed in response to all sample instances sharing the same label,
is highly concentrated, ideally to the extent that each such cluster collapses into a single
probability distribution. Based on this observation and by measuring the
concentration of each cluster in terms of conditional mutual information (CMI), a new
training method, called the CMI minimized (CMIM) method, is proposed, which trains a DNN
by jointly minimizing the conventional cross entropy (CE) loss and the CMI values of all
temperature-scaled clusters across the entire temperature spectrum. The resulting CMIM
model is shown, by extensive experiments, to be undistillable by all tested KD methods
existing in the literature. That is, the knockoff students distilled by these KD methods
from the CMIM model underperform the respective LS students. In addition, the CMIM
model is also shown to perform better than the model trained with the CE loss alone in
terms of prediction accuracy.
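To make the described objective concrete, the sketch below shows one plausible PyTorch form of a CMIM-style training loss, assuming the CMI of each label cluster is estimated as the average KL divergence between a sample's temperature-scaled output distribution and the centroid of its cluster. The temperature grid `temperatures`, the weight `lambda_cmi`, and the helper names `cluster_cmi` and `cmim_loss` are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch of a CMIM-style loss: cross entropy plus an estimate of the CMI of
# temperature-scaled output clusters. Hyperparameters and helper names are assumptions.
import torch
import torch.nn.functional as F

def cluster_cmi(probs, labels, num_classes):
    """Estimate CMI as the mean KL divergence from each sample's output
    distribution to the centroid of the cluster sharing its label."""
    cmi = probs.new_zeros(())
    for c in range(num_classes):
        mask = labels == c
        if mask.sum() == 0:
            continue
        cluster = probs[mask]                          # distributions for label c
        centroid = cluster.mean(dim=0, keepdim=True)   # cluster centroid
        kl = (cluster * (cluster.clamp_min(1e-12).log()
                         - centroid.clamp_min(1e-12).log())).sum(dim=1)
        cmi = cmi + kl.sum()
    return cmi / probs.size(0)

def cmim_loss(logits, labels, num_classes,
              temperatures=(1.0, 2.0, 4.0, 8.0), lambda_cmi=0.1):
    """Jointly minimize CE and the CMI of temperature-scaled clusters."""
    ce = F.cross_entropy(logits, labels)
    cmi_total = logits.new_zeros(())
    for t in temperatures:
        probs_t = F.softmax(logits / t, dim=1)         # temperature-scaled outputs
        cmi_total = cmi_total + cluster_cmi(probs_t, labels, num_classes)
    return ce + lambda_cmi * cmi_total / len(temperatures)
```

In a standard training loop, `cmim_loss(model(x), y, num_classes)` would simply replace the plain cross-entropy criterion; all other training details stay unchanged.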
Submission Length: Regular submission (no more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=jVABSsD4Vf
Assigned Action Editor: ~Hanwang_Zhang3
Submission Number: 4164