Abstract: Whether on e-commerce platforms or short-video platforms, the effective use of multimodal data plays an important role in recommendation systems. A growing number of researchers are exploring how to exploit multimodal signals effectively to encourage users to purchase goods or watch short videos. Some studies incorporate multimodal features into the model as side information and have achieved promising results. In practice, however, users' purchase behavior depends largely on their subjective intentions, and it is difficult for neural networks to filter out noisy information and extract high-level intention information. To investigate the benefits of latent intentions and leverage them effectively for recommendation, we propose a Multimodal-aware Multi-intention Learning method for recommendation (MMIL). Specifically, we establish the relationship between intentions and the recommendation objective via a probabilistic formulation, and propose a multi-intention recommendation optimization objective that avoids intention overfitting. We then construct an intention representation learner to learn accurate representations of multiple intentions. Further, considering the close relationship between user intentions and multimodal signals, we introduce a modality attention mechanism to learn modality-aware intention representations. In addition, we design a multi-intention contrastive module to assist the learning of the multiple intention representations. On three real-world datasets, the proposed MMIL method outperforms other state-of-the-art methods, and comprehensive experiments verify the effectiveness of the intention modeling and the intention contrastive module.
Primary Subject Area: [Engagement] Multimedia Search and Recommendation
Secondary Subject Area: [Experience] Multimedia Applications
Relevance To Conference: This work is highly relevant to multimodal learning and will advance the field of multimodal recommendation systems.
Submission Number: 3831