Target-Embedding Autoencoder With Knowledge Distillation for Multi-Label Classification

Published: 01 Jan 2024, Last Modified: 13 Nov 2024 · IEEE Trans. Emerg. Top. Comput. Intell. 2024 · CC BY-SA 4.0
Abstract: In multi-label classification, a key challenge is modeling the correlations between labels. One solution is the Target Embedding Autoencoder (TEA), but most TEA-based frameworks have large numbers of parameters and high model complexity, which makes them difficult to apply to large-scale learning problems. To address this issue, we propose a Target Embedding Autoencoder framework based on Knowledge Distillation (KD-TEA) that compresses a Teacher model with many parameters into a small Student model through knowledge distillation. Specifically, KD-TEA transfers the dark knowledge learned by the Teacher model to the Student model. This dark knowledge provides effective regularization that alleviates over-fitting during training, thereby enhancing the generalization ability of the Student model and improving its performance on multi-label tasks. To let the Student model learn the Teacher model's knowledge more directly, we improve the distillation loss: KD-TEA uses an MSE loss instead of the KL-divergence loss to improve performance on multi-label tasks. Experiments on multiple datasets show that our KD-TEA framework outperforms state-of-the-art multi-label classification methods in both performance and efficiency.
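The paper itself provides no code here; the following is a minimal, hypothetical sketch of the distillation objective the abstract describes (an MSE term between Teacher and Student outputs in place of the usual KL-divergence term, combined with a standard multi-label supervised loss), assuming a PyTorch setup with sigmoid outputs. All names (kd_tea_loss, alpha, the weighting scheme) are illustrative assumptions, not taken from the paper.

```python
# Hypothetical sketch of the distillation loss described in the abstract:
# the Student matches the Teacher's predictions with an MSE term instead
# of the KL-divergence used in classic knowledge distillation, plus a
# standard multi-label BCE term on the ground-truth labels.
import torch
import torch.nn.functional as F

def kd_tea_loss(student_logits, teacher_logits, targets, alpha=0.5):
    """Combine a supervised multi-label loss with an MSE distillation term.

    student_logits, teacher_logits: (batch, num_labels) raw scores
    targets: (batch, num_labels) binary ground-truth labels
    alpha: assumed weight balancing supervision vs. distillation
    """
    # Supervised term: independent sigmoid / binary cross-entropy per label.
    supervised = F.binary_cross_entropy_with_logits(student_logits, targets)

    # Distillation term: MSE between Teacher and Student probabilities,
    # replacing the KL-divergence term of standard knowledge distillation.
    distill = F.mse_loss(torch.sigmoid(student_logits),
                         torch.sigmoid(teacher_logits).detach())

    return (1 - alpha) * supervised + alpha * distill
```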