Seeing Beyond Noise: Joint Graph Structure Evaluation and Denoising for Multimodal Recommendation

Published: 01 Jan 2025, Last Modified: 02 Aug 2025 · AAAI 2025 · CC BY-SA 4.0
Abstract: Multimodal Recommendation Systems (MRSs) improve upon traditional user-item interaction-based methods by incorporating multimodal information. However, existing methods overlook the inherent noise introduced by (1) noisy semantic priors in multimodal content and (2) noisy user interactions in historical records, thereby diminishing model performance. To fill this gap, we propose to denoise MRSs by jointly EValuating structure Effectiveness and mitigating Noisy links (EVEN). First, to address semantic prior noise in multimodal content, EVEN models homogeneous consistency among items and denoises it by evaluating behavior-driven confidence. Second, to address noise in user interactions, EVEN refines user feedback by denoising observed interactions, guided by an implicit contribution evaluation of high-order representations. Third, EVEN performs cross-modal alignment through self-guided structure learning, reinforcing task-specific inter-modal dependency modeling and cross-modal fusion. Extensive experiments on three widely used datasets show that EVEN achieves average improvements of 8.95% and 5.90% in recommendation accuracy over LGMRec and FREEDOM, respectively, without extending the total training time.
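The abstract does not spell out the denoising procedure. As a rough illustration of the general idea of pruning a semantically derived item-item graph with behavior-driven confidence, the Python sketch below builds a kNN graph from multimodal item features and drops edges whose co-interaction confidence falls below a threshold. All names (`build_knn_graph`, `behavior_confidence`, `denoise_graph`) and the threshold `tau` are assumptions made for illustration only; this is not the EVEN implementation described in the paper.

```python
# Illustrative sketch only: generic confidence-based pruning of an item-item
# kNN graph. Names and thresholds are assumptions, NOT taken from the paper.
import numpy as np

def build_knn_graph(item_features: np.ndarray, k: int = 5) -> np.ndarray:
    """Build a kNN item-item adjacency from (possibly noisy) modal features."""
    # Cosine similarity between item feature vectors.
    norm = item_features / (np.linalg.norm(item_features, axis=1, keepdims=True) + 1e-12)
    sim = norm @ norm.T
    np.fill_diagonal(sim, -np.inf)            # exclude self-loops
    adj = np.zeros_like(sim)
    topk = np.argsort(-sim, axis=1)[:, :k]    # keep the k most similar items per row
    rows = np.repeat(np.arange(sim.shape[0]), k)
    adj[rows, topk.ravel()] = sim[rows, topk.ravel()]
    return adj

def behavior_confidence(interactions: np.ndarray) -> np.ndarray:
    """Behavior-driven confidence that two items are related:
    symmetrically normalized item co-interaction counts across users."""
    co = interactions.T @ interactions        # (items x items) co-occurrence counts
    deg = np.maximum(co.diagonal(), 1.0)
    return co / np.sqrt(np.outer(deg, deg))

def denoise_graph(adj: np.ndarray, conf: np.ndarray, tau: float = 0.1) -> np.ndarray:
    """Drop semantic edges whose behavior-driven confidence is below tau."""
    return np.where(conf >= tau, adj, 0.0)

# Toy usage: 6 items with 8-dim modal features, 10 users of implicit feedback.
rng = np.random.default_rng(0)
feats = rng.normal(size=(6, 8))
inter = (rng.random((10, 6)) > 0.6).astype(float)
graph = denoise_graph(build_knn_graph(feats, k=3), behavior_confidence(inter), tau=0.1)
print(graph.shape)  # (6, 6) denoised item-item adjacency
```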