Abstract: Highlights
• We improve the effectiveness of knowledge distillation (KD) by adapting the training of the teacher.
• The trained teachers are applicable not only to general KD but also benefit various distillation methods, including FitNet, AT, CRD, VID, DKD, SP, SRRL, and NKD.