- TL;DR: We develop a strategy for pre-training Graph Neural Networks (GNNs) and systematically study its effectiveness on multiple datasets, GNN architectures, and diverse downstream tasks.
- Abstract: Many domains in machine learning have datasets with a large number of related but different tasks. Those domains are challenging because task-specific labels are often scarce and test examples can be distributionally different from examples seen during training. An effective solution to these challenges is to pre-train a model on related tasks where data is abundant, and then fine-tune it on a downstream task of interest. While pre-training has been effective for improving many language and vision domains, pre-training on graph datasets remains an open question. Here, we develop a strategy for pre-training Graph Neural Networks (GNNs). Crucial to the success of our strategy is to pre-train an expressive GNN at the level of individual nodes as well as entire graphs. We systematically study different pre-training strategies on multiple datasets and find that when ad-hoc strategies are applied, pre-trained GNNs often exhibit negative transfer and perform worse than non-pre-trained GNNs on many downstream tasks. In contrast, our proposed strategy is effective and avoids negative transfer across downstream tasks, leading up to 11.7% absolute improvements in ROC-AUC over non-pre-trained models and achieving state of the art performance.
- Keywords: Pre-training, Transfer learning, Graph Neural Networks