Abstract: With the rapid development of smart TV, TV recommendation is attracting more and more users. TV users usually distribute in multiple regions with different cultures and hence have diverse TV program preferences. From the perspective of engineering practice and performance improvement, it's very essential to model users from multiple regions with one single model. In previous work, Multi-gate Mixture-of-Expert (MMoE) has been widely adopted in multi-task and multi-domain recommendation scenarios. In practice, however, we first observe the embeddings generated by experts tend to be homogeneous which may result in high semantic similarities among embeddings that reduce the capability of Multi-gate Mixture-of-Expert (MMoE) model. Secondly, we also find there are lots of commonalities and differences between multiple regions regarding user preferences. Therefore, it's meaningful to model the complicated relationships between regions. In this paper, we first introduce contrastive learning to overcome the expert representation degeneration problem. The embeddings of two augmented samples generated by the same experts are pushed closer to enhance the alignment, and the embeddings of the same samples generated by different experts are pushed away in vector space to improve uniformity. Then we propose a Graph-based Gating Mechanism to empower typical Multi-gate Mixture-of-Experts. Graph-based MMoE is able to recognize the commonalities and differences among multiple regions by introducing a Graph Neural Network (GNN) with region similarity prior. We name our model Multi-gate Mixture-of-Contrastive-Experts model with Graph-based Gating Mechanism (MMoCEG). Extensive offline experiments and online A/B tests on a commercial TV service provider over 100 million users and 2.3 million items demonstrate the efficacy of MMoCEG compared to the existing models.
Loading