Keywords: molecular representation learning, molecular property prediction, pre-training on graphs
TL;DR: We provide a comprehensive survey of pre-training Graph Neural Networks for molecular representations.
Abstract: Recent years have witnessed remarkable advances in molecular representation learning using Graph Neural Networks (GNNs). To fully exploit unlabeled molecular data, researchers first pre-train GNNs on large-scale molecular databases and then fine-tune these pre-trained Graph Models (GMs) on downstream tasks. The knowledge implicitly encoded in the model parameters can benefit various downstream tasks and help alleviate several fundamental challenges of molecular representation learning. In this paper, we provide a comprehensive survey of pre-trained GMs for molecular representations. We first briefly present the limitations of molecular graph representation learning, which motivate molecular graph pre-training. Next, we systematically categorize existing pre-trained GMs under a taxonomy with four perspectives: model architectures, pre-training strategies, tuning strategies, and applications. Finally, we outline several promising research directions that can serve as a guideline for future studies.
Track: Highlight Track