GraphGT: Machine Learning Datasets for Graph Generation and Transformation

Published: 22 Oct 2021, Last Modified: 05 May 2023
NeurIPS-AI4Science Poster
Readers: Everyone
Keywords: Graph Generation, Graph Transformation, Machine Learning, Datasets, Benchmarks
TL;DR: A machine learning dataset collection for graph generation and graph transformation
Abstract: Graph generation, which learns from known graphs and discovers novel graphs, has great potential in numerous research topics such as drug design and mobility synthesis, and is one of the fastest-growing domains due to its promise for discovering new knowledge. Though many benchmark datasets have emerged in the domain of graph representation learning, real-world datasets for the graph generation problem are far fewer and limited to a small number of areas such as molecules and citation networks. To fill this gap, we introduce GraphGT, a large dataset collection for the graph generation problem in machine learning, which contains 36 datasets from 9 domains across 6 subjects. To help researchers explore the datasets, we provide a systematic review and classification of the datasets from various views, including research tasks, graph types, and application domains. In addition, GraphGT provides an easy-to-use graph generation pipeline that simplifies graph data loading, experimental setup, and model evaluation. The community can query and access datasets of interest according to a specific domain, task, or type of graph. GraphGT will be regularly updated and welcomes input from the community. GraphGT is publicly available at \url{} and can also be accessed via an open Python library.
Track: Original Research Track