Teach sample-specific knowledge: Separated distillation based on samples

Published: 2025 · Last Modified: 04 Nov 2025 · Eng. Appl. Artif. Intell. 2025 · CC BY-SA 4.0
Abstract:

Highlights
• We highlight the limitations of previous knowledge distillation (KD) methods based on forward KL divergence.
• Datasets are split to tackle mode-averaging and teacher errors on uncertain images.
• Correct samples use a reverse KL divergence (RKLD) loss, while incorrect samples encourage student self-learning.
• Our method achieves superior performance in both classification and object detection.
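The highlights describe a per-sample split of the distillation objective: samples the teacher classifies correctly are distilled with a reverse KL loss, while samples the teacher gets wrong fall back to learning from the ground-truth label alone. A minimal PyTorch sketch of that split loss is given below; the function name, `temperature`, and `alpha` are illustrative assumptions, not values or interfaces taken from the paper.

```python
import torch
import torch.nn.functional as F

def separated_distillation_loss(student_logits, teacher_logits, labels,
                                temperature=4.0, alpha=1.0):
    """Sketch of a sample-separated KD loss (hypothetical hyperparameters).

    Teacher-correct samples get a reverse-KL term KL(p_student || p_teacher);
    teacher-incorrect samples use only cross-entropy on the true label
    (student "self-learning").
    """
    # Split the batch by whether the teacher's prediction is correct.
    teacher_correct = teacher_logits.argmax(dim=1).eq(labels)

    # Hard-label cross-entropy for every sample, kept per-sample.
    ce = F.cross_entropy(student_logits, labels, reduction="none")

    # Reverse KL per sample: sum_c p_s * (log p_s - log p_t),
    # with the usual temperature^2 rescaling from Hinton-style KD.
    log_p_s = F.log_softmax(student_logits / temperature, dim=1)
    log_p_t = F.log_softmax(teacher_logits / temperature, dim=1)
    rkl = F.kl_div(log_p_t, log_p_s, reduction="none",
                   log_target=True).sum(dim=1) * temperature ** 2

    # Distill only where the teacher is correct; elsewhere, CE alone.
    loss = ce + alpha * teacher_correct.float() * rkl
    return loss.mean()
```

Reverse KL is mode-seeking, so on teacher-correct samples the student concentrates on the teacher's dominant mode instead of averaging over all of its probability mass, which is the mode-averaging failure of forward-KL distillation that the first highlight points to.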