Highlights
•In knowledge distillation (KD), a teacher model pre-trained on finite samples tends to be overconfident.
•Coded knowledge distillation (CKD) distils a more generalized output to the student model.
•Compared to KD, the modified teacher in CKD has a front layer that encodes the input image adaptively.
•The CKD framework is realized with an adaptive encoding method based on JPEG compression.
•Thorough experiments demonstrate the effectiveness of CKD.
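A minimal sketch of the idea described in the highlights, not the authors' implementation: the teacher receives a JPEG re-encoded ("coded") version of the input, and its softened output is distilled into the student with a standard KD objective. The fixed `quality` argument stands in for the paper's adaptive encoding method, and all function and parameter names are assumptions.

```python
import io

import torch
import torch.nn.functional as F
from PIL import Image
from torchvision import transforms

to_tensor = transforms.ToTensor()
to_pil = transforms.ToPILImage()


def jpeg_encode(x: torch.Tensor, quality: int) -> torch.Tensor:
    """Round-trip a (C, H, W) image tensor in [0, 1] through JPEG at the given quality."""
    buf = io.BytesIO()
    to_pil(x.clamp(0, 1)).save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return to_tensor(Image.open(buf).convert("RGB"))


def kd_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.5):
    """Standard KD objective: temperature-scaled KL term plus hard-label CE term."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1.0 - alpha) * hard


def ckd_step(student, teacher, images, targets, quality=50):
    """One distillation step: the teacher consumes coded images, the student the originals."""
    coded = torch.stack([jpeg_encode(img, quality) for img in images])
    with torch.no_grad():
        teacher_logits = teacher(coded)  # less overconfident targets from the coded input
    student_logits = student(images)
    return kd_loss(student_logits, teacher_logits, targets)
```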