Self-supervised multi-hop heterogeneous hypergraph embedding with informative pooling for graph-level classification

Malik Khizar Hayat, Shan Xue, Jian Yang

Published: 2025, Last Modified: 19 Aug 2025. Venue: Knowl. Inf. Syst. 2025. License: CC BY-SA 4.0

Abstract: In heterogeneous graph analysis, existing self-supervised learning (SSL) methods face several key challenges. First, these approaches are tailored to node-level tasks and fail to capture global graph-level features, which are crucial for comprehensive graph understanding. Second, they rely predominantly on meta-path-based techniques to unravel graph structure, a process that can be computationally intensive and is often intractable for complex networks. Third, they cannot account for non-pairwise relationships, a common characteristic of real-world networks such as protein-protein interaction and collaboration networks, which limits their effectiveness in graph-level learning where high-order connectivity is essential. To address these issues, we propose an SSL framework for heterogeneous hypergraph embedding, expressly designed to enhance graph-level classification. Our framework introduces multi-hop attention in hypergraph convolution, a departure from existing hypergraph attention mechanisms, which primarily focus on immediate neighborhoods. This multi-hop approach captures relational structures both near and far, uncovering intricate patterns integral to accurate graph-level classification. Complementing this, we implement an informative graph-level attentive pooling mechanism as an alternative to traditional aggregation methods. It synthesizes features according to their structural and semantic importance within the hypergraph, thereby preserving critical contextual information. Furthermore, we refine our contrastive learning approach and introduce targeted negative sampling strategies, creating a more robust learning environment for discerning nuanced graph-level features. Evaluation against established graph kernels, graph neural networks, and graph pooling methods on real-world datasets shows that our model outperforms these baselines, validating its effectiveness in addressing the complexities inherent in heterogeneous graph-level classification.
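
To make the two central ideas concrete, the following is a minimal NumPy sketch of multi-hop propagation over a hypergraph followed by attentive pooling into a single graph-level vector. It is an illustration under stated assumptions and not the authors' model: the function names, the normalized operator D_v^-1 H D_e^-1 H^T, the fixed hop logits, and the query vector `q` are all hypothetical stand-ins for learned components, and node/hyperedge heterogeneity and the contrastive objective are omitted.

```python
import numpy as np

def softmax(v):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(v - np.max(v))
    return e / e.sum()

def hypergraph_operator(H):
    """Row-stochastic node-to-node operator D_v^-1 H D_e^-1 H^T.

    H is an (n_nodes, n_hyperedges) incidence matrix; every node and
    every hyperedge is assumed to have degree >= 1.
    """
    node_deg = H.sum(axis=1)   # number of hyperedges per node
    edge_deg = H.sum(axis=0)   # number of nodes per hyperedge
    return (H / node_deg[:, None]) @ (H / edge_deg[None, :]).T

def multi_hop_embed(H, X, hop_logits):
    """Mix 1..K hop propagations of X with softmax-normalized hop weights.

    hop_logits is an assumed stand-in for learned multi-hop attention.
    """
    P = hypergraph_operator(H)
    weights = softmax(np.asarray(hop_logits, dtype=float))
    Z = np.asarray(X, dtype=float)
    out = np.zeros_like(Z)
    for w in weights:
        Z = P @ Z              # propagate one additional hop
        out += w * Z
    return out

def attentive_pool(Z, q):
    """Pool node embeddings into one vector using attention scores Z @ q."""
    alpha = softmax(Z @ q)     # one weight per node, summing to 1
    return alpha @ Z, alpha

# Toy hypergraph: 4 nodes, 2 hyperedges ({0, 1, 2} and {2, 3}).
H = np.array([[1, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
X = np.random.default_rng(0).normal(size=(4, 3))
Z = multi_hop_embed(H, X, hop_logits=[0.0, 0.0, 0.0])
graph_vec, alpha = attentive_pool(Z, q=np.ones(3))
```

Unlike mean or sum pooling, the attention weights `alpha` let some nodes contribute more to the graph representation than others; in a trained model both the hop weights and `q` would be learned rather than fixed.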