Synthetic Graph Generation to Benchmark Graph LearningDownload PDF

Published: 22 Nov 2022, Last Modified: 22 Oct 2023NeurIPS 2022 GLFrontiers WorkshopReaders: Everyone
Keywords: graphs, graph neural networks, benchmark
TL;DR: Synthetic graphs allow for scientific benchmarks of GNNs.
Abstract: Despite advances in the field of Graph Neural Networks (GNNs), only a small number (~5) of datasets are currently used to evaluate new models. This continued reliance on a handful of datasets provides minimal insight into the performance differences between models, and is especially challenging for industrial practitioners who have datasets which are very different from academic benchmarks. In this work we introduce GraphWorld, a novel methodology and system for benchmarking GNN models on an arbitrarily-large population of synthetic graphs for any conceivable GNN task. GraphWorld allows a user to efficiently generate a \emph{world} with millions of statistically diverse datasets. It is accessible, scalable, and easy to use. GraphWorld can be run on a single machine without specialized hardware, or it can be easily scaled up to run on arbitrary clusters or cloud frameworks. Using GraphWorld, a user has fine-grained control over graph generator parameters, and can benchmark arbitrary GNN models. We present insights from GraphWorld experiments on the performance of thirteen GNN models and baselines over millions of benchmark datasets. We show that GraphWorld efficiently explores regions of benchmark dataset space uncovered by standard benchmarks, revealing comparisons between models that have not been historically obtainable. Using GraphWorld, we also are able to study in-detail the relationship between graph properties and task performance metrics, which is nearly impossible with the classic collection of real-world benchmarks.
Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 1 code implementation](https://www.catalyzex.com/paper/arxiv:2204.01376/code)
1 Reply

Loading