Robust Self-supervised Learning in Heterogeneous Graph Based on Feature-Topology Balancing

22 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Heterogeneous Graph, Knowledge graph, Self-supervised learning
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: A novel robust self-supervised learning framework for heterogeneous graphs, the first to be trained by striking a balance between graph topology and node features.
Abstract: In recent years, graph neural network (GNN) based self-supervised learning on heterogeneous information networks (HINs) has garnered considerable attention. Most prior studies follow a message-passing approach in which the features of a central node are updated from the features of its neighboring nodes. Because these methods depend on both informative graph topology and informative node features, their performance deteriorates significantly when either factor is degraded. Moreover, since real-world HINs are highly noisy and validating the importance of attributes is challenging, it is rare for both the graph topology and the node features to be of good quality. To address this problem, we propose BFTNet (robust self-supervised heterogeneous graph learning using the Balance between node Features and graph Topology), the first model that explicitly separates graph topology from node features in a heterogeneous graph. BFTNet employs a knowledge graph embedding module that focuses on global graph topology and a contrastive learning module dedicated to learning node features. Because this structure handles graph topology and node features separately, BFTNet can assign higher importance to either factor, allowing it to respond effectively to the skewed datasets found in real-world settings. Moreover, each module can be designed optimally for its own modality, so BFTNet improves performance without sacrificing one modality to accommodate the characteristics of the other. Lastly, BFTNet introduces a novel graph conversion scheme and a representation fusion method that ensure the topology and feature representations are effectively learned and integrated. The self-supervised learning performance of BFTNet is verified by extensive experiments on four real-world benchmark datasets, and its robustness is demonstrated by experiments on noisy datasets.
The source code of BFTNet will be available in the final version.
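The abstract describes a two-module objective: a knowledge graph embedding loss for global topology and a contrastive loss for node features, combined so that one factor can be weighted more heavily than the other. A minimal sketch of that balancing idea is below; the TransE-style margin loss, the InfoNCE-style contrastive loss, and the balance coefficient `lam` are illustrative assumptions, since the paper's exact losses and fusion method are not specified here.

```python
import numpy as np

rng = np.random.default_rng(0)

def transe_loss(h, r, t, t_neg, margin=1.0):
    """TransE-style margin ranking loss over triples (head, relation, tail).

    Encourages ||h + r - t|| to be smaller than ||h + r - t_neg||
    for corrupted tails t_neg (an assumed stand-in for the topology module).
    """
    pos = np.linalg.norm(h + r - t, axis=-1)
    neg = np.linalg.norm(h + r - t_neg, axis=-1)
    return np.maximum(0.0, margin + pos - neg).mean()

def info_nce_loss(z1, z2, tau=0.5):
    """InfoNCE-style contrastive loss between two feature views.

    Row i of z1 and row i of z2 are treated as a positive pair;
    all other rows act as negatives (a stand-in for the feature module).
    """
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / tau
    sim -= sim.max(axis=1, keepdims=True)  # numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.diag(log_prob).mean()

def bft_objective(topo_loss, feat_loss, lam=0.5):
    """Hypothetical balance: lam -> 1 favors topology, lam -> 0 favors features."""
    return lam * topo_loss + (1.0 - lam) * feat_loss

# Toy embeddings: 8 triples / 8 nodes with 16-dimensional representations.
h, r, t, t_neg = (rng.normal(size=(8, 16)) for _ in range(4))
z1, z2 = rng.normal(size=(8, 16)), rng.normal(size=(8, 16))
total = bft_objective(transe_loss(h, r, t, t_neg), info_nce_loss(z1, z2), lam=0.7)
```

On a skewed dataset (e.g., noisy features but a clean graph), this formulation lets `lam` shift the training signal toward the reliable modality, which is the balancing behavior the abstract attributes to BFTNet.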
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4881