EKD: Effective Knowledge Distillation for Few-Shot Sentiment Analysis

Published: 2024 · Last Modified: 07 Jan 2026 · ICANN (7) 2024 · CC BY-SA 4.0
Abstract: In few-shot sentiment analysis, existing distillation methods, often referred to as mimicry-based distillation, mainly focus on aligning the hidden states or attention matrices of the teacher and student models. In scenarios with very limited data, merely mimicking a teacher model's complex output representations can be insufficient for the student model to capture the semantic information required to construct those representations. To address this problem, we propose (i) a construction-based distillation method, Effective Knowledge Distillation (EKD), and (ii) an interleaved-layer distillation strategy that yields the best performance for EKD. On several publicly available small Chinese datasets, our construction-based EKD method achieves higher accuracy than mimicry-based distillation methods. All code and data are publicly available at https://anonymous.4open.science/r/MAKD-275D.
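To make the contrast concrete, below is a minimal sketch of the mimicry-based baseline the abstract argues against: aligning selected teacher and student hidden states with an MSE loss under an interleaved layer mapping. The function names, the specific mapping rule, and the assumption of matching hidden sizes are all illustrative choices, not the paper's actual EKD objective (which the abstract does not spell out).

```python
import torch
import torch.nn.functional as F

def mimicry_distillation_loss(student_hidden, teacher_hidden, layer_map):
    """Average MSE between mapped student/teacher hidden states.

    student_hidden / teacher_hidden: lists of [batch, seq, hidden] tensors,
    one per transformer layer. Assumes matching hidden sizes; in practice a
    learned linear projection is often inserted when they differ.
    """
    losses = [F.mse_loss(student_hidden[s], teacher_hidden[t])
              for s, t in layer_map]
    return torch.stack(losses).mean()

def interleaved_layer_map(num_student_layers, num_teacher_layers):
    """One plausible 'interleaved' mapping (an assumption, not the paper's):
    student layer i aligns with every stride-th teacher layer."""
    stride = num_teacher_layers // num_student_layers
    return [(i, i * stride + stride - 1) for i in range(num_student_layers)]

# Toy usage: a 12-layer teacher distilled into a 4-layer student.
teacher_h = [torch.randn(2, 16, 768) for _ in range(12)]
student_h = [torch.randn(2, 16, 768) for _ in range(4)]
loss = mimicry_distillation_loss(student_h, teacher_h,
                                 interleaved_layer_map(4, 12))
print(loss.item())
```

Under this sketch, the student only regresses onto the teacher's final representations; the paper's construction-based EKD instead targets the semantic information needed to build those representations, which is precisely what pure mimicry misses in the few-shot regime.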