Keywords: subgraphs, graph generation, out of distribution, distribution shift, link prediction, graph neural networks
TL;DR: Synthetic graph samples augment the training dataset to improve link prediction performance in OOD scenarios
Abstract: Graph Neural Networks (GNNs) demonstrate high performance on link prediction
(LP) datasets, especially when the distribution of testing samples falls within the
dataset’s training distribution. However, GNNs suffer decreased performance
when evaluated on samples drawn from outside their training distribution. Meanwhile,
graph generative models (GGMs) show a pronounced ability to generate novel
output graphs. Despite this, the application of GGMs remains largely limited to
domain-specific tasks. To bridge this gap, we propose leveraging GGMs to produce
synthetic samples which extrapolate between training and testing distributions.
These synthetic samples are then used for fine-tuning GNNs to improve link
prediction performance in out-of-distribution (OOD) scenarios. We introduce a
theoretical perspective on this phenomenon, which we further verify empirically via
increased performance across synthetic and real-world OOD settings. We conduct
further analysis of how inducing structural change in training
samples improves OOD performance, pointing to promising new directions in
graph data augmentation for link structures.
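A minimal sketch of the fine-tuning pipeline the abstract describes, assuming a hypothetical pretrained graph generative model and GNN link predictor; the names `gnn`, `ggm`, `sample_synthetic_graphs`, and the dot-product edge decoder are all illustrative assumptions, not the authors' actual implementation:

```python
import torch
import torch.nn as nn

def fine_tune_on_synthetic(gnn, ggm, sample_synthetic_graphs,
                           epochs=5, lr=1e-4, batch_size=32):
    """Fine-tune a pretrained link-prediction GNN on GGM-generated graphs.

    Hypothetical interfaces (assumptions for illustration):
      - gnn(x, edge_index) -> node embeddings z of shape [N, d]
      - sample_synthetic_graphs(ggm, n) yields tuples of
        (node features x, edge_index, positive edges, negative edges)
    """
    opt = torch.optim.Adam(gnn.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        for x, edge_index, pos_edges, neg_edges in sample_synthetic_graphs(ggm, batch_size):
            z = gnn(x, edge_index)  # node embeddings from the GNN encoder
            edges = torch.cat([pos_edges, neg_edges], dim=1)
            labels = torch.cat([torch.ones(pos_edges.size(1)),
                                torch.zeros(neg_edges.size(1))])
            # Dot-product decoder: score each candidate edge (u, v) as <z_u, z_v>
            logits = (z[edges[0]] * z[edges[1]]).sum(-1)
            opt.zero_grad()
            loss_fn(logits, labels).backward()
            opt.step()
    return gnn
```

The key design point, per the abstract, is that the synthetic graphs come from a GGM trained to extrapolate between training and testing distributions, so fine-tuning on them exposes the link predictor to the structural shift it will face at test time.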
Supplementary Material: zip
Primary Area: learning on graphs and other geometries & topologies
Submission Number: 16556