GFSE: A Foundational Model For Graph Structural Encoding

ICLR 2025 Conference Submission 9032 Authors

27 Sept 2024 (modified: 13 Oct 2024) · ICLR 2025 Conference Submission · CC BY 4.0
Keywords: foundation model, graph representation learning
Abstract: Foundation models have recently shown remarkable promise by leveraging extensive pre-training on diverse datasets to acquire generalizable representations that transfer effectively to a wide range of downstream tasks. In the graph domain, however, most existing pre-trained models are tailored to specific domains, largely because the semantic meaning of graph features differs across contexts. Moreover, most existing models struggle to capture the rich topological complexity of graph structures, leading to inadequate exploration of the embedding space. To address these challenges, we propose a novel Graph Foundational Structural Encoder (GFSE) that identifies universal structural patterns, yielding a unified feature embedding space suitable for diverse domains, including molecular structures, social networks, and citation networks. GFSE is the first cross-domain graph structural encoder pre-trained with multiple self-supervised learning objectives. Built on a Graph Transformer, GFSE biases its attention mechanism with graph structural information, allowing it to encode intricate multi-level and fine-grained topological features within complex graph structures. The pre-trained GFSE produces generic and theoretically expressive positional and structural encodings for graphs, which can be seamlessly integrated with various downstream graph feature encoders, including graph neural networks for graphs with vectorized features and Large Language Models for text-attributed graphs. Comprehensive experiments on synthetic and real-world datasets demonstrate GFSE's capability to significantly enhance downstream model performance while requiring substantially less task-specific fine-tuning. Notably, GFSE boosts performance by an average of 20.48% across eight real-world datasets, highlighting its potential as a powerful and adaptable foundational encoder for graph-structured data.
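To make the structure-biased attention described in the abstract concrete, the sketch below shows one standard way a Graph Transformer layer can shift its attention logits with a learned bias indexed by a pairwise structural feature (here, bucketed shortest-path distance). This is a minimal illustration under stated assumptions, not the authors' released code; the class name, the `num_spd_buckets` parameter, and the toy inputs are all hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StructuralBiasAttention(nn.Module):
    """Single-head self-attention whose logits are shifted by a learned
    scalar bias indexed by bucketed shortest-path distance; one common
    way to bias a Graph Transformer's attention with graph structure.
    (Illustrative sketch, not GFSE's actual architecture.)"""

    def __init__(self, dim: int, num_spd_buckets: int = 32):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        # One learnable bias per structural-distance bucket.
        self.bias = nn.Embedding(num_spd_buckets, 1)
        self.num_spd_buckets = num_spd_buckets
        self.scale = dim ** -0.5

    def forward(self, x: torch.Tensor, spd: torch.Tensor) -> torch.Tensor:
        # x:   [num_nodes, dim] node states
        # spd: [num_nodes, num_nodes] integer shortest-path distances
        spd = spd.clamp(max=self.num_spd_buckets - 1)
        logits = (self.q(x) @ self.k(x).T) * self.scale
        logits = logits + self.bias(spd).squeeze(-1)  # add structural bias
        return F.softmax(logits, dim=-1) @ self.v(x)

# Toy usage: 4 nodes on a path graph, 16-dimensional node states.
x = torch.randn(4, 16)
spd = torch.tensor([[0, 1, 2, 3],
                    [1, 0, 1, 2],
                    [2, 1, 0, 1],
                    [3, 2, 1, 0]])
out = StructuralBiasAttention(16)(x, spd)  # shape [4, 16]
```

In a pipeline like the one the abstract outlines, the encoder's output would serve as a positional/structural encoding, for example concatenated with node features before a downstream GNN, or supplied alongside text embeddings from an LLM for text-attributed graphs.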
Primary Area: learning on graphs and other geometries & topologies
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 9032