Abstract: Highlights
• Conducted an in-depth analysis of knowledge distillation employing the ideal joint classifier.
• Presented a comprehensive proof establishing the error bound of the student network as a function of the teacher's error.
• Introduced a novel knowledge distillation framework grounded in the ideal joint classifier assumption.