Bridging the Gap: Teacher-Assisted Wasserstein Knowledge Distillation for Efficient Multi-Modal Recommendation

Published: 29 Jan 2025, Last Modified: 29 Jan 2025 · WWW 2025 Poster · CC BY 4.0
Track: User modeling, personalization and recommendation
Keywords: Multi-modal Recommendation, Knowledge Distillation, Optimal Transport, Wasserstein Distance
TL;DR: We present a novel teacher-assisted Wasserstein knowledge distillation framework to compress a cumbersome multi-modal recommender system into an efficient student model.
Abstract: Multi-modal recommender systems (MMRecs) leverage diverse modalities to deliver personalized recommendations, yet they often struggle with efficiency due to the large size of modality encoders and the complexity of fusing high-dimensional features. To address the efficiency issue, a promising solution is to compress a cumbersome MMRec into a lightweight ID-based Multi-Layer Perceptron recommender system (MLPRec) through Knowledge Distillation (KD). Despite its effectiveness, this approach overlooks the significant gap between the complex teacher MMRec and the lightweight, ID-based student MLPRec, which differ substantially in size, architecture, and input modalities, leading to ineffective knowledge transfer and suboptimal student performance. To bridge this gap, we propose TARec, a novel teacher-assisted Wasserstein Knowledge Distillation framework for compressing MMRecs into an efficient MLPRec. TARec introduces: (i) a two-stage KD process using an intermediate Teacher Assistant (TA) model to bridge the gap between teacher and student, facilitating smoother knowledge transfer; (ii) logit-level KD using the Wasserstein Distance as the metric, replacing the conventional KL divergence to ensure stable gradient flow even under significant teacher-student gaps; and (iii) embedding-level contrastive KD to further distill high-quality embedding-level knowledge from the teacher. Extensive experiments on real-world datasets verify the effectiveness of TARec, demonstrating that it significantly outperforms state-of-the-art MMRecs while reducing computational costs. Our anonymous code is available at: https://anonymous.4open.science/r/TARec-0980/.
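To illustrate component (ii), the sketch below compares teacher and student logit distributions with a 1-Wasserstein distance instead of KL divergence. This is a minimal toy illustration, not the paper's implementation: it assumes the item logits can be treated as a discrete distribution on an ordered support with unit spacing, in which case the 1-Wasserstein distance reduces to the L1 distance between the two CDFs (a full optimal-transport formulation over an unordered item set would require a ground cost matrix). All function names here are illustrative.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def wasserstein_1d(p, q):
    """1-Wasserstein distance between two discrete distributions sharing an
    ordered support with unit spacing: the L1 distance between their CDFs.
    Unlike KL divergence, this stays finite and smooth even when the two
    distributions have little overlap, which is the stability argument
    the paper makes for large teacher-student gaps."""
    cp, cq, dist = 0.0, 0.0, 0.0
    for pi, qi in zip(p, q):
        cp += pi
        cq += qi
        dist += abs(cp - cq)
    return dist

# Toy logits over a tiny item catalog (illustrative values only).
teacher_logits = [2.0, 1.0, 0.5, -1.0]
student_logits = [0.5, 1.5, 0.0, -0.5]

loss = wasserstein_1d(softmax(teacher_logits), softmax(student_logits))
```

In a training loop this scalar would serve as the logit-level distillation loss; TARec's actual loss additionally involves the TA model and the contrastive embedding-level term.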
Submission Number: 380