Discover the Effective Strategy for Face Recognition Model Compression by Improved Knowledge Distillation

Abstract: In pursuit of higher accuracy, face recognition models have grown steadily larger, which makes them difficult to deploy on embedded systems. This work proposes an effective model compression method based on knowledge distillation, in which a fast student model is trained under the guidance of a complex teacher model. First, different loss combinations and network architectures are analyzed through comprehensive experiments to identify the most effective approach. To further improve performance, the feature layer is normalized so that the optimization objective is consistent with the cosine similarity metric used at evaluation. Moreover, a teacher weighting strategy is proposed to address cases where the teacher provides wrong guidance. Experimental results show that the student model built with our approach can surpass the teacher model while achieving 3× acceleration.
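To make the two key ideas concrete, below is a minimal PyTorch sketch of a feature-level distillation loss that (a) L2-normalizes both embeddings so the objective matches the cosine similarity metric, and (b) downweights samples on which the teacher is wrong. The function name, arguments, and the hard 0/1 weighting rule are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_feat, teacher_feat, teacher_logits, labels, alpha=1.0):
    """Sketch of normalized-feature distillation with teacher weighting.

    student_feat, teacher_feat: (B, D) embedding vectors
    teacher_logits: (B, C) teacher classification logits
    labels: (B,) ground-truth identity labels
    """
    # L2-normalize both embeddings so the regression target lives on the
    # unit hypersphere, consistent with the cosine metric used at test time.
    s = F.normalize(student_feat, dim=1)
    t = F.normalize(teacher_feat, dim=1)

    # Per-sample cosine distance between student and teacher embeddings.
    per_sample = 1.0 - (s * t).sum(dim=1)

    # Teacher weighting (assumption: hard 0/1 rule): ignore the teacher on
    # samples it misclassifies; a softer scheme could instead use the
    # teacher's softmax confidence on the true class.
    weights = (teacher_logits.argmax(dim=1) == labels).float()

    return alpha * (weights * per_sample).sum() / weights.sum().clamp(min=1.0)
```

In practice such a term would be added to the student's usual classification loss; the weighting simply prevents the student from imitating the teacher on samples where the teacher itself fails.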