Graph-Oriented Cross-Modality Diffusion for Multimedia Recommendation

Published: 2025, Last Modified: 21 Jan 2026, Venue: ADMA (1) 2025, License: CC BY-SA 4.0
Abstract: Multimedia recommender systems have gained significant attention with the proliferation of multimedia-sharing platforms. Existing approaches primarily model user-item bipartite graphs enhanced with multimodal features, and they often overlook the rich structural information embedded in the cross-modality item-item graph. In this paper, we introduce Graph-oriented cross-modality diffusion for multimedia Recommendation (GoodRec), a novel framework that mines high-order relations within the cross-modality item-item graph for multimedia recommendation. Specifically, we first model a unified multi-modality item-item graph as a multivariate heat diffusion system, with an energy function that guides both intra-modality and inter-modality diffusion toward consistent representations. We then develop an enhanced multimedia recommendation module that constructs modality-specific graphs from the diffusion-refined representations and employs adversarial mechanisms to strengthen the modeling of user-item interactions. Extensive experiments on three real-world multimedia datasets show that GoodRec consistently outperforms state-of-the-art baselines, confirming the effectiveness of mining high-order relations in the cross-modality graph structure via diffusion.
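To make the idea of energy-guided diffusion on an item-item graph more concrete, the following is a minimal sketch, not the GoodRec formulation, which the abstract does not spell out. It assumes two modalities (visual and textual) already projected to a common dimension, a unified graph formed by averaging per-modality kNN graphs, and a simple quadratic energy combining graph smoothness within each modality with a coupling term between modalities. All function names, the energy, and the parameters `k`, `beta`, and `tau` are hypothetical choices for illustration.

```python
import numpy as np

def knn_graph(feats, k):
    """Symmetric kNN adjacency from cosine similarity (hypothetical helper)."""
    normed = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)            # exclude self-loops
    idx = np.argsort(-sim, axis=1)[:, :k]     # top-k neighbours per item
    adj = np.zeros_like(sim)
    rows = np.repeat(np.arange(len(feats)), k)
    adj[rows, idx.ravel()] = 1.0
    return np.maximum(adj, adj.T)             # symmetrise

def normalized_laplacian(adj):
    """L = I - D^{-1/2} A D^{-1/2}; its eigenvalues lie in [0, 2]."""
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    return np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]

def energy(lap, x_v, x_t, beta):
    """Assumed energy: intra-modality smoothness plus inter-modality coupling."""
    smooth = np.trace(x_v.T @ lap @ x_v) + np.trace(x_t.T @ lap @ x_t)
    return smooth + beta * np.sum((x_v - x_t) ** 2)

def diffuse(lap, x_v, x_t, beta=0.5, tau=0.2, steps=10):
    """Explicit Euler steps that descend the assumed energy.

    Each step applies heat diffusion within a modality (the lap @ x term)
    and pulls the two modalities toward each other (the beta term).
    The energy is non-increasing when tau <= 1 / (1 + beta).
    """
    energies = [energy(lap, x_v, x_t, beta)]
    for _ in range(steps):
        new_v = x_v - tau * (lap @ x_v + beta * (x_v - x_t))
        new_t = x_t - tau * (lap @ x_t + beta * (x_t - x_v))
        x_v, x_t = new_v, new_t               # simultaneous update
        energies.append(energy(lap, x_v, x_t, beta))
    return x_v, x_t, energies

# Toy data: 30 items with 8-dimensional features per modality (synthetic).
rng = np.random.default_rng(0)
visual = rng.normal(size=(30, 8))
textual = rng.normal(size=(30, 8))

# Assumed "unified" graph: average of the per-modality kNN graphs.
unified_adj = 0.5 * (knn_graph(visual, k=5) + knn_graph(textual, k=5))
lap = normalized_laplacian(unified_adj)

refined_v, refined_t, energies = diffuse(lap, visual, textual)
print("energy before:", energies[0], "after:", energies[-1])
```

In this sketch the step size is kept below `1 / (1 + beta)` so that each update lowers the energy; the refined features could then be used to rebuild modality-specific item-item graphs, as the abstract describes for the downstream recommendation module.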