VFT: A versatile fine-tuning scheme based on feature distribution-aware knowledge distillation for lightweight convolutional neural networks

Published: 2025, Last Modified: 10 Nov 2025
Venue: Eng. Appl. Artif. Intell. 2025
License: CC BY-SA 4.0
Abstract: Highlights
• We introduce feature distribution knowledge distillation (FDKD) fine-tuning.
• FDKD captures the feature distribution and works for both KD fine-tuning and training from scratch.
• Layer-wise FDKD improves recovery when fine-tuning compressed models that share the original model's structure.
• Our method outperforms standard fine-tuning and KD fine-tuning on four benchmark datasets.