Abstract: In commercial systems, the interaction between user behavior and advertising content forms a complex graph that exhibits multi-scenario and multi-modal characteristics. Multi-scenario means that ads are typically displayed in different scenarios on the same platform, such as the news feed and short-video scenes of the Baidu App; across scenarios, users' interests share commonalities but also differ. Multi-modal means that each advertising content modality influences users differently, and users' sensitivity to different modalities may also vary. We believe that the relationship between user behavior and advertising content is fundamentally multi-layered, shaped by the scenario in which content is presented and by the distinct features of each modality. For example, the video title may matter more in the news feed scenario, whereas the visual content of the video may be more prominent in the video stream scenario. We propose a novel heterogeneous graph neural network that integrates multi-scenario domains and multi-modal content features, which we call the Domain-Aware Multi-Modal graph model (DAMM-GCN). Our model achieves significant improvements across multiple scenarios over traditional flat graph models such as Metapath2vec [1], GraphSAGE [2], and MMGCN [3]. Furthermore, the model has been deployed in the Baidu advertising system, yielding a 2.08% improvement in CPM (Cost Per Mille).
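To make the idea of a domain-aware, multi-modal graph layer concrete, the following is a minimal sketch in plain PyTorch. It is not the authors' implementation: the class name DomainAwareMultiModalConv, the dense adjacency input, and the per-domain attention over modality-specific neighbor aggregations are assumptions introduced for illustration only.

```python
# Illustrative sketch (not the paper's code) of a graph convolution that
# aggregates neighbors per content modality and fuses the modality messages
# with attention weights conditioned on the display scenario (domain).
import torch
import torch.nn as nn
import torch.nn.functional as F


class DomainAwareMultiModalConv(nn.Module):
    def __init__(self, in_dim, out_dim, num_modalities, num_domains):
        super().__init__()
        # One linear transform per modality (e.g. title text, video frames).
        self.modality_proj = nn.ModuleList(
            [nn.Linear(in_dim, out_dim) for _ in range(num_modalities)]
        )
        # Per-domain query vectors that score each modality's message.
        self.domain_query = nn.Embedding(num_domains, out_dim)

    def forward(self, feats, adj, domain_id):
        # feats: (num_modalities, num_nodes, in_dim) modality-specific features
        # adj:   (num_nodes, num_nodes) row-normalized dense adjacency
        # domain_id: scalar LongTensor selecting the scenario (feed vs. video)
        messages = []
        for m, proj in enumerate(self.modality_proj):
            # GCN-style neighbor aggregation for a single modality.
            messages.append(adj @ proj(feats[m]))
        msgs = torch.stack(messages, dim=0)              # (M, N, out_dim)

        # Domain-conditioned attention over modalities.
        query = self.domain_query(domain_id)             # (out_dim,)
        scores = torch.einsum("mnd,d->mn", msgs, query) / msgs.size(-1) ** 0.5
        alpha = F.softmax(scores, dim=0).unsqueeze(-1)   # (M, N, 1)
        return F.relu((alpha * msgs).sum(dim=0))         # (N, out_dim)


if __name__ == "__main__":
    # Toy usage: 5 nodes, 2 modalities, 2 scenarios (news feed, short video).
    N, M, D = 5, 2, 8
    feats = torch.randn(M, N, D)
    adj = torch.softmax(torch.randn(N, N), dim=-1)       # stand-in normalized adjacency
    layer = DomainAwareMultiModalConv(D, 16, num_modalities=M, num_domains=2)
    out = layer(feats, adj, torch.tensor(0))
    print(out.shape)                                     # torch.Size([5, 16])
```

The key design point this sketch tries to mirror is that the same node features are aggregated once per modality, and the scenario (domain) decides how much each modality contributes, so the video title can dominate in the feed scenario while visual features dominate in the video-stream scenario.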