Abstract: Multimodal recommender systems (MRSs) exploit diverse content sources (e.g., text, images) to enhance recommendation accuracy. However, they still face two fundamental challenges: (1) effectively capturing users’ diverse interests, and (2) filtering out noisy signals from heterogeneous modalities. To address these issues, we propose DMIMRec, a framework for Disentangled Multi-Interest Modeling in multimodal recommendation. On the item side, we construct modality-specific item–item graphs and introduce a reconstruction-difference guided pruning strategy. This strategy scores each edge by how much the reconstruction quality changes when that edge is removed, and discards edges that contribute little or introduce noise. On the user side, we cluster each user's interacted items to initialize multi-interest prototypes, then apply a triple disentanglement module that separates user representations into interest-invariant, effective interest-specific, and ineffective interest-specific components, which sharpens semantic boundaries and avoids representation collapse. Experiments on three benchmark datasets demonstrate that DMIMRec achieves improved performance across multiple evaluation metrics.
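The edge-pruning idea can be illustrated with a minimal sketch. It is not the DMIMRec implementation: the abstract does not specify the reconstruction model, so the sketch assumes each item is reconstructed as the mean of its neighbors' feature vectors, and the names `node_error`, `prune_edges`, and the tolerance `tol` are hypothetical.

```python
import numpy as np

def node_error(X, A, i):
    """Squared error of reconstructing item i as the mean of its neighbors.

    Assumption: mean-of-neighbors reconstruction; an item with no
    neighbors is reconstructed as the zero vector.
    """
    nbrs = np.flatnonzero(A[i])
    if nbrs.size == 0:
        return float(np.sum(X[i] ** 2))
    return float(np.sum((X[i] - X[nbrs].mean(axis=0)) ** 2))

def prune_edges(X, A, tol=0.0):
    """Drop directed edge (i, j) when removing it raises item i's
    reconstruction error by at most `tol` (hypothetical threshold)."""
    A = A.copy()          # work on a copy so the caller's graph is untouched
    pruned = A.copy()
    for i, j in zip(*np.nonzero(A)):
        base = node_error(X, A, i)
        A[i, j] = 0       # temporarily remove the edge
        delta = node_error(X, A, i) - base
        A[i, j] = 1       # restore it before scoring the next edge
        if delta <= tol:  # the edge did not help reconstruction: discard it
            pruned[i, j] = 0
    return pruned

# Toy graph: item 0 is linked to a similar item (1) and a dissimilar one (2).
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
A = np.zeros((3, 3), dtype=int)
A[0, 1] = A[0, 2] = 1
print(prune_edges(X, A)[0])
```

In this toy graph, removing the edge to the dissimilar item lowers item 0's reconstruction error, so that edge is pruned, while the edge to the similar item is kept. Each edge is scored independently against the original graph, which is a simplification; a real system might prune iteratively or score edges in batches.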