Meta Learning-based Multimodal Recommendation with Adaptive User Modality-Aware Preference Integration

Zhenchao Wu, Hongteng Xu, Xu Chen

Published: 2025, Last Modified: 01 Mar 2026, Venue: MMSP 2025, License: CC BY-SA 4.0
Abstract: By incorporating multimodal content, multimodal recommender systems learn more powerful representations and improve personalized recommendation. However, most existing methods directly mix the modality embeddings and ID embeddings by element-wise summation to form the final representations. This strategy can hardly distinguish the importance of different representations and entangles the ID and modality effects, and is thus unable to fully leverage multimodal features for personalized recommendation. To address this problem, we design a meta-weight net that learns the relevance between pairs of embeddings (including modality embeddings and ID embeddings) from users and items, respectively. Specifically, the proposed model first applies graph convolutional networks to generate the modality and ID representations for users and items. For each type of user representation, we calculate its relevance weights with respect to all types of item representations using the designed meta-weight net. Afterward, we use these relevance weights to adaptively integrate the corresponding probability scores into the final recommendation result. Moreover, we adopt contrastive learning to alleviate the data sparsity issue and introduce a KL divergence constraint to better align user preferences with multimodal content. Experiments on three public datasets demonstrate that the proposed approach outperforms state-of-the-art methods.
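The adaptive integration step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify the form of the meta-weight net, the scoring function, or the modality names, so the following are all assumptions made for illustration: a single linear layer over the concatenated user and item embeddings stands in for the meta-weight net (`meta_weight_logit`), a sigmoid of the dot product stands in for the probability score, the weights are normalized with a softmax over item representation types, the per-user-type results are averaged, and the representation types are called `id`, `visual`, and `textual`. Random vectors replace the graph-convolution outputs.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(xs):
    # Subtract the maximum for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def meta_weight_logit(u, v, w):
    # Hypothetical stand-in for the meta-weight net: one linear layer
    # applied to the concatenation of a user and an item embedding.
    return dot(u + v, w)


def fused_score(user_reps, item_reps, w):
    """Combine pairwise probability scores using learned relevance weights."""
    item_types = sorted(item_reps)
    per_user_type = []
    for u in user_reps.values():
        # Relevance of this user representation to every item representation.
        logits = [meta_weight_logit(u, item_reps[t], w) for t in item_types]
        weights = softmax(logits)
        # Assumed probability score: sigmoid of the dot product.
        probs = [sigmoid(dot(u, item_reps[t])) for t in item_types]
        per_user_type.append(sum(wt * p for wt, p in zip(weights, probs)))
    # Assumption: average the results over user representation types.
    return sum(per_user_type) / len(per_user_type)


rng = random.Random(0)
dim = 4
kinds = ["id", "visual", "textual"]  # assumed representation types
user_reps = {k: [rng.gauss(0, 1) for _ in range(dim)] for k in kinds}
item_reps = {k: [rng.gauss(0, 1) for _ in range(dim)] for k in kinds}
w = [rng.gauss(0, 0.1) for _ in range(2 * dim)]  # untrained meta-net weights

score = fused_score(user_reps, item_reps, w)
print(score)
```

Because each fused value is a convex combination of sigmoid outputs, the result stays in (0, 1). In contrast with an element-wise sum of embeddings, every ID and modality pairing contributes its own score, so the weights can be inspected per pairing.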