Keywords: Heterogeneous Information Networks, Heterogeneous Graph Neural Networks, Graph Representation Learning
Abstract: Heterogeneous Information Networks (HINs) provide a powerful framework for modeling multi-typed entities and relations, typically defined under a fixed schema. Yet, most research assumes this structure is given, overlooking the fact that alternative designs can emphasize different aspects of the data and substantially influence downstream performance.
As a theoretical foundation for such designs, we introduce the principle of entity-attribute duality: attributes can be atomized as entities with their associated relations, while entities can, in turn, serve as attributes of others. This principle motivates the atomic HIN, a canonical representation that makes all modeling choices explicit and achieves maximal expressiveness.
Building on this foundation, we propose a systematic framework for task-specific schema refinement.
Within this framework, we demonstrate that widely used benchmarks correspond to heuristic refinements of the atomic HIN, often far from optimal.
Across eight datasets, refinement alone enables a simplified Relational GCN (sRGCN) to reach state-of-the-art performance on node- and link-level tasks, with further gains from advanced HGNNs. These results highlight schema design as a key dimension in heterogeneous graph modeling.
By releasing the atomic HINs, searched schemas, and refinement framework, we enable principled benchmarking and open the way for future work on schema-aware learning, automated structure discovery, and next-generation HGNNs.
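The entity-attribute duality stated in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `atomize`, the record layout, and the toy paper data are all hypothetical, chosen only to show one direction of the duality, in which attributes are turned into entities connected by typed relations.

```python
from collections import defaultdict

def atomize(records, entity_type, attribute_keys):
    """Hypothetical helper: promote selected attributes to nodes with typed edges."""
    nodes = defaultdict(set)  # node type -> set of node ids
    edges = defaultdict(set)  # (src type, relation, dst type) -> set of (src id, dst id)
    for rec in records:
        src = rec["id"]
        nodes[entity_type].add(src)
        for key in attribute_keys:
            values = rec.get(key, [])
            # Treat a scalar attribute as a single-element collection.
            if not isinstance(values, (list, tuple, set)):
                values = [values]
            for value in values:
                # Each distinct attribute value becomes an entity of type `key`.
                nodes[key].add(value)
                edges[(entity_type, f"has_{key}", key)].add((src, value))
    return dict(nodes), dict(edges)

# Toy data (assumed for illustration only).
papers = [
    {"id": "p1", "venue": "KDD", "year": 2020, "keyword": ["graphs", "schema"]},
    {"id": "p2", "venue": "KDD", "year": 2021, "keyword": ["graphs"]},
]
nodes, edges = atomize(papers, "paper", ["venue", "year", "keyword"])
```

In this sketch, two papers sharing the venue attribute end up linked through a single shared venue node. A schema refinement could then choose, per task, which of these atomized node types to keep as entities and which to fold back into attributes.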
Supplementary Material: zip
Primary Area: learning on graphs and other geometries & topologies
Submission Number: 17110