Abstract: Real-world graphs are typically complex, exhibiting heterogeneity in their global structure as well as strong heterophily within local neighborhoods. While a growing body of literature has revealed the limitations of graph neural networks (GNNs) in handling homogeneous graphs with heterophily, little work has investigated heterophily in the context of heterogeneous graphs. To bridge this research gap, we identify heterophily in heterogeneous graphs using metapaths and propose two practical metrics to quantitatively describe the level of heterophily. Our empirical investigations on real-world heterogeneous graphs reveal that heterogeneous graph neural networks (HGNNs), which inherit many mechanisms from GNNs designed for homogeneous graphs, struggle to generalize to heterogeneous graphs with heterophily or low levels of homophily. To address this challenge, we present Hetero$^{2}$Net, a heterophily-aware HGNN that incorporates masked metapath prediction and masked label prediction tasks to effectively and flexibly handle both homophilic and heterophilic heterogeneous graphs. We evaluate Hetero$^{2}$Net on five real-world heterogeneous graph benchmarks with varying levels of heterophily. Experimental results demonstrate that Hetero$^{2}$Net outperforms strong baselines in the semi-supervised node classification task. In particular, Hetero$^{2}$Net scales to an industrial-scale commercial graph with 13 M nodes and 157 M edges, demonstrating its effectiveness in handling large and complex heterogeneous graphs.
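The abstract does not define the two proposed metrics, so the following is only a minimal sketch of the general idea of measuring homophily along a metapath. It assumes a simple edge-homophily ratio: the fraction of metapath-connected node pairs that share a label. The function names, the edge-list input format, and the toy paper-author graph are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def metapath_neighbors(edges_by_hop, start_nodes):
    """Follow a metapath hop by hop.

    edges_by_hop: list of directed edge lists [(src, dst), ...], one per hop.
    Returns a dict mapping each start node to the set of end nodes it reaches.
    """
    reach = {v: {v} for v in start_nodes}
    for edges in edges_by_hop:
        out = defaultdict(set)
        for src, dst in edges:
            out[src].add(dst)
        # Advance every frontier by one hop of the metapath.
        reach = {
            v: set().union(*(out[u] for u in frontier))
            for v, frontier in reach.items()
        }
    return reach


def metapath_edge_homophily(edges_by_hop, labels):
    """Fraction of metapath-connected pairs with equal labels (assumed metric).

    labels: dict mapping labeled target-type nodes to their class.
    Self-pairs and unlabeled end nodes are ignored.
    """
    reach = metapath_neighbors(edges_by_hop, labels.keys())
    same = total = 0
    for v, ends in reach.items():
        for u in ends:
            if u == v or u not in labels:
                continue
            total += 1
            same += labels[u] == labels[v]
    return same / total if total else float("nan")


# Toy example: the metapath paper -> author -> paper.
paper_author = [("p0", "a0"), ("p1", "a0"), ("p2", "a1"), ("p3", "a1")]
author_paper = [(a, p) for p, a in paper_author]
paper_labels = {"p0": 0, "p1": 0, "p2": 1, "p3": 0}

ratio = metapath_edge_homophily([paper_author, author_paper], paper_labels)
print(ratio)
```

Under this assumed definition, a ratio near 1 indicates that the metapath mostly links nodes of the same class (homophily), and a ratio near 0 indicates heterophily. Different metapaths over the same graph can yield different ratios.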
External IDs:dblp:journals/pami/LiWZWZCZ25