Cyclic Data Distillation Semi-Supervised Learning for Multi-Modal Emotion Recognition

Published: 2025 · Last Modified: 13 Jan 2026 · IEEE Trans. Knowl. Data Eng. 2025 · License: CC BY-SA 4.0
Abstract: Multi-modal emotion recognition (MER) integrates multi-modal signals to help computers comprehensively understand human emotions, and it is a crucial technology for human-computer interaction. However, labeled multi-modal emotion data are scarce because manual annotation is expensive, which limits MER performance. Meanwhile, semi-supervised learning (SSL) methods that improve MER models with massive unlabeled data suffer from confirmation bias, resulting in a biased data distribution. To tackle these challenges, this paper proposes a cyclic data distillation semi-supervised learning (CDD-SSL) framework for MER tasks. CDD-SSL leverages multiple pre-trained unimodal teacher models and confidence-boosting pseudo-labelling (CBPL) to boost the confidence of multi-modal ensemble outputs and to distill reliable, class-representative data from a large pool of unlabeled data. It then uses this reliable, less-biased data to train a multi-modal student model, whose feedback in turn updates all unimodal teacher models. CDD-SSL is thus a cyclic teacher-student framework with a feedback mechanism that gradually mitigates confirmation bias and yields an effective MER model. Experimental results on four benchmark datasets demonstrate that CDD-SSL achieves superior performance over both semi-supervised methods and state-of-the-art fully-supervised models on MER tasks.
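
The following is a minimal, hedged sketch of the cyclic teacher-student loop outlined in the abstract, using toy linear classifiers on synthetic two-modality data. The concrete choices here (temperature sharpening as a stand-in for CBPL, a per-class confidence threshold as the "reliable and class-representative" selection rule, and fine-tuning teachers on the distilled samples as the feedback step) are illustrative assumptions, not the paper's actual method.

import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

class LinearModel:
    """Toy classifier (one linear layer, gradient descent) standing in for a teacher/student network."""
    def __init__(self, dim, n_classes):
        self.W = rng.normal(scale=0.01, size=(dim, n_classes))
    def predict_proba(self, X):
        return softmax(X @ self.W)
    def fit(self, X, y, epochs=100, lr=0.1):
        Y = np.eye(self.W.shape[1])[y]
        for _ in range(epochs):
            P = self.predict_proba(X)
            self.W -= lr * X.T @ (P - Y) / len(X)

def boost_confidence(probs, temperature=0.5):
    # Assumed stand-in for CBPL: sharpen the ensemble distribution with a temperature.
    sharpened = probs ** (1.0 / temperature)
    return sharpened / sharpened.sum(axis=1, keepdims=True)

def distill_reliable(probs, per_class=20, threshold=0.9):
    # Keep only high-confidence samples, capped per class to reduce class bias.
    labels, conf = probs.argmax(axis=1), probs.max(axis=1)
    keep = []
    for c in range(probs.shape[1]):
        idx = np.where((labels == c) & (conf >= threshold))[0]
        keep.extend(idx[np.argsort(-conf[idx])][:per_class])
    return np.array(keep, dtype=int), labels

# Synthetic two-modality data for a 3-class problem (labels of the unlabeled pool are hidden).
n_lab, n_unlab, d, C = 60, 600, 16, 3
means_a, means_v = rng.normal(size=(C, d)), rng.normal(size=(C, d))
sample = lambda y, means: means[y] + 0.5 * rng.normal(size=(len(y), d))
y_lab = rng.integers(0, C, n_lab)
Xa_lab, Xv_lab = sample(y_lab, means_a), sample(y_lab, means_v)
y_hidden = rng.integers(0, C, n_unlab)
Xa_un, Xv_un = sample(y_hidden, means_a), sample(y_hidden, means_v)

# Pre-train one teacher per modality on the small labeled set.
teacher_a = LinearModel(d, C); teacher_a.fit(Xa_lab, y_lab)
teacher_v = LinearModel(d, C); teacher_v.fit(Xv_lab, y_lab)
student = LinearModel(2 * d, C)  # multi-modal student on concatenated features

for cycle in range(3):
    # 1) Ensemble the unimodal teachers on unlabeled data and boost confidence.
    ens = boost_confidence((teacher_a.predict_proba(Xa_un) + teacher_v.predict_proba(Xv_un)) / 2)
    # 2) Distill reliable, class-representative pseudo-labeled samples.
    keep, pseudo = distill_reliable(ens)
    # 3) Train the multi-modal student on labeled plus distilled pseudo-labeled data.
    X_student = np.vstack([np.hstack([Xa_lab, Xv_lab]), np.hstack([Xa_un[keep], Xv_un[keep]])])
    y_student = np.concatenate([y_lab, pseudo[keep]])
    student.fit(X_student, y_student)
    # 4) Feedback: update each unimodal teacher with the distilled data in its own modality.
    teacher_a.fit(np.vstack([Xa_lab, Xa_un[keep]]), y_student)
    teacher_v.fit(np.vstack([Xv_lab, Xv_un[keep]]), y_student)
    acc = (pseudo[keep] == y_hidden[keep]).mean() if len(keep) else float("nan")
    print(f"cycle {cycle}: kept {len(keep)} pseudo-labeled samples, pseudo-label accuracy {acc:.2f}")

Over cycles, the idea is that better teachers yield more reliable pseudo-labels, which in turn train a better student and feed back into the teachers; the selection cap per class is what keeps the distilled set from collapsing onto a few easy classes.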