DVIB: Towards Robust Multimodal Recommender Systems via Variational Information Bottleneck Distillation
Track: User modeling, personalization and recommendation
Keywords: Multimodal Recommender System, Robust Training, Variational Information Bottleneck, Feature Distillation
Abstract: In multimodal recommender systems (MRS), integrating various modalities helps to model user preferences and item characteristics more accurately, thereby assisting users in discovering items that match their interests.
Although introducing multimodal information offers opportunities for performance improvement, it also increases the risk of inherent noise and information redundancy, posing challenges to the robustness of MRS.
Existing methods typically address these two issues separately, either by introducing perturbations at the model input for robust training to handle noise, or by designing complex network structures to filter out redundant information. In contrast, we propose the DVIB framework, which addresses both issues simultaneously in a simple manner. We find that moving perturbations from the input layer to the hidden layer, combined with feature self-distillation, mitigates noise and handles information redundancy without altering the original network architecture. We also provide theoretical evidence for the effectiveness of DVIB, showing that the framework not only explicitly enhances the robustness of model training but also implicitly induces an information bottleneck effect, which effectively reduces redundant information during multimodal fusion and improves feature extraction quality. Extensive experiments show that DVIB consistently improves the performance of MRS across different datasets and model settings, and that it complements existing robust training methods, representing a promising new paradigm for MRS. The code and all models will be released online.
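To make the core mechanism concrete, the following is a minimal, hypothetical PyTorch-style sketch of hidden-layer perturbation combined with feature self-distillation, as described in the abstract. The function names (`encoder`, `head`, `dvib_step`) and the Gaussian-noise and MSE choices are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def dvib_step(encoder, head, x, sigma=0.1, alpha=0.5):
    """Illustrative DVIB-style training step (assumed sketch, not the official code).

    encoder: maps a multimodal input x to a hidden feature h
    head:    maps a hidden feature to recommendation scores
    sigma:   std of the Gaussian perturbation injected at the hidden layer
    alpha:   weight of the feature self-distillation term
    """
    h = encoder(x)                             # clean hidden feature
    h_noisy = h + sigma * torch.randn_like(h)  # perturb the hidden layer, not the input
    scores = head(h_noisy)                     # task prediction from the perturbed feature

    # Feature self-distillation: pull the perturbed feature toward the
    # detached clean feature, which serves as its own teacher.
    distill_loss = F.mse_loss(h_noisy, h.detach())
    return scores, alpha * distill_loss
```

Under this reading, the task loss computed on `scores` would be added to the distillation term; the noise injected after the encoder is what yields the robustness and implicit information bottleneck effects the abstract claims.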
Submission Number: 333