Multi-Modal Knowledge Distillation for Recommendation with Prompt-Tuning

Published: 23 Jan 2024, Last Modified: 23 May 2024 · TheWebConf24
Keywords: Recommendation System, Knowledge Distillation, Prompt-Tuning, Multi-Modal
TL;DR: Multi-Modal Knowledge Distillation for Recommendation with Prompt-Tuning
Abstract: Multimedia online platforms, such as Amazon and TikTok, have greatly benefited from incorporating multimedia content (e.g., visual, textual, and acoustic modalities) into their personalized recommender systems. These modalities provide intuitive semantics that facilitate modality-aware user preference modeling. However, two key challenges in multi-modal recommendation have not yet been well addressed: i) introducing multi-modal encoders with a large number of additional parameters causes overfitting, given the high-dimensional multi-modal features produced by extractors (e.g., ViT, BERT); ii) as side information, media content inevitably introduces inaccuracies and redundancies, which skew the modeled modality-interaction dependencies away from the true relationships. To tackle these problems, we propose MMKD, a Multi-Modal Knowledge Distillation framework for recommendation with prompt-tuning, aimed at obtaining a lightweight yet powerful inference model. Specifically, MMKD distills edge relationships (Collaborative KD, Modality-aware KD) and node content (Modality-aware Feature KD) from a cumbersome teacher, relieving the student of additional feature-reduction layers while maintaining accuracy. Besides, we introduce soft prompt-tuning to enable quality-adaptive knowledge distillation and to enhance the overfitting-prone teacher. This approach allows MMKD to perform student task-adaptive distillation while bridging the semantic gap between multi-modal content and collaborative signals. Additionally, to mitigate the impact of inaccuracies in multimedia data, a disentangled multi-modal list-wise distillation is developed with a modality-aware re-weighting mechanism. Extensive experiments on real-world datasets demonstrate the superiority of our method over state-of-the-art recommendation techniques. Further evaluation shows promising model efficiency and component-wise effectiveness.
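To make the disentangled list-wise distillation with modality-aware re-weighting concrete, here is a minimal, hypothetical PyTorch sketch. It is not the paper's released code: the function names (listwise_kd_loss, modality_reweighted_kd), the KL-divergence list-wise loss, and the softmax-normalized learnable modality weights are all assumptions about one plausible instantiation of the described mechanism.

```python
import torch
import torch.nn.functional as F

def listwise_kd_loss(teacher_scores: torch.Tensor,
                     student_scores: torch.Tensor,
                     temperature: float = 2.0) -> torch.Tensor:
    """List-wise KD: align the student's ranking distribution over a
    candidate item list with the teacher's temperature-softened one."""
    t = F.softmax(teacher_scores / temperature, dim=-1)
    s = F.log_softmax(student_scores / temperature, dim=-1)
    # KL(teacher || student), scaled by T^2 as is standard in distillation
    return F.kl_div(s, t, reduction="batchmean") * temperature ** 2

def modality_reweighted_kd(teacher_scores_by_modality,
                           student_scores: torch.Tensor,
                           modality_logits: torch.Tensor) -> torch.Tensor:
    """Disentangled multi-modal list-wise distillation: distill each
    modality's teacher ranking separately, then combine the per-modality
    losses with learnable modality-aware weights so that noisy or
    redundant modalities can be down-weighted."""
    losses = torch.stack([
        listwise_kd_loss(t, student_scores)
        for t in teacher_scores_by_modality
    ])
    weights = F.softmax(modality_logits, dim=0)  # sums to 1 over modalities
    return (weights * losses).sum()

# Toy usage: 3 modalities (visual, textual, acoustic), a batch of 4 users,
# each with scores over 10 candidate items.
teachers = [torch.randn(4, 10) for _ in range(3)]
student = torch.randn(4, 10, requires_grad=True)
modality_logits = torch.zeros(3, requires_grad=True)
loss = modality_reweighted_kd(teachers, student, modality_logits)
loss.backward()
```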
Track: User Modeling and Recommendation
Submission Guidelines Scope: Yes
Submission Guidelines Blind: Yes
Submission Guidelines Format: Yes
Submission Guidelines Limit: Yes
Submission Guidelines Authorship: Yes
Student Author: Yes
Submission Number: 285