Keywords: Graph Learning, Transformers, Random walks, Pre-Training
TL;DR: We propose strategies for self-supervised pre-training of Transformers via random walks, enabling effective adaptation across diverse domains and downstream tasks at the node, link, and graph levels of text-attributed graphs.
Abstract: Pre-training large-scale models on diverse data with the Transformer architecture has driven significant advances in natural language understanding. Motivated by this success, we explore pre-training strategies for graph representation learning that leverage the flexibility of Transformers. A key challenge is enabling a sequence-based Transformer to effectively encode graphs of varying sizes and from diverse domains. To address this challenge, we represent nodes as collections of random walks, allowing the Transformer to learn node embeddings from sequential contexts. We provide a theoretical analysis of the expressive capacity of this representation for distinguishing graph structures. We also introduce a novel context prediction loss tailored to random walks. Empirically, we show that the proposed pre-training strategy can be adapted to various downstream graph tasks, highlighting its promise for processing and reasoning with graph-structured data.
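The abstract describes representing each node as a collection of random walks that a sequence-based Transformer can consume. The paper's exact sampling procedure is not given here; the following is a minimal sketch of uniform random-walk sampling over a plain adjacency list, with the function name `sample_walks` and all parameters chosen for illustration only.

```python
import random

def sample_walks(adj, start, num_walks=4, walk_len=5, seed=0):
    """Sample `num_walks` uniform random walks of length `walk_len`
    starting from `start`. The resulting set of node sequences can be
    tokenized and fed to a Transformer as the node's context."""
    rng = random.Random(seed)
    walks = []
    for _ in range(num_walks):
        walk = [start]
        for _ in range(walk_len - 1):
            nbrs = adj[walk[-1]]
            if not nbrs:  # dead end: stop this walk early
                break
            walk.append(rng.choice(nbrs))
        walks.append(walk)
    return walks

# Toy graph: a 4-cycle 0-1-2-3-0.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
walks = sample_walks(adj, start=0)
```

Each walk here is a list of node IDs; in a text-attributed graph, each ID would be replaced by the node's text features before tokenization.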
Primary Area: learning on graphs and other geometries & topologies
Submission Number: 15481