VFT: A versatile fine-tuning scheme based on feature distribution-aware knowledge distillation for lightweight convolutional neural networks
Abstract: Highlights
• We introduce the feature distribution knowledge distillation (FDKD) fine-tuning.
• FDKD captures feature distribution and works for both KD fine-tuning and scratch training.
• Layer-wise FDKD improves recovery when fine-tuning same-structure compressed models.
• Our method outperforms standard and KD fine-tuning on four benchmark datasets.
External IDs: dblp:journals/eaai/HongK25