Beyond Real-world Benchmark Datasets: An Empirical Study of Node Classification with GNNs

Seiji Maekawa; Koki Noda; Yuya Sasaki; Makoto Onizuka

Beyond Real-world Benchmark Datasets: An Empirical Study of Node Classification with GNNs

Seiji Maekawa, Koki Noda, Yuya Sasaki, Makoto Onizuka

Published: 17 Sept 2022, Last Modified: 20 Apr 2025NeurIPS 2022 Datasets and Benchmarks Readers: Everyone

Keywords: graph neural networks, classification, heterophily, synthetic graphs

Abstract: Graph Neural Networks (GNNs) have achieved great success on a node classification task. Despite the broad interest in developing and evaluating GNNs, they have been assessed with limited benchmark datasets. As a result, the existing evaluation of GNNs lacks fine-grained analysis from various characteristics of graphs. Motivated by this, we conduct extensive experiments with a synthetic graph generator that can generate graphs having controlled characteristics for fine-grained analysis. Our empirical studies clarify the strengths and weaknesses of GNNs from four major characteristics of real-world graphs with class labels of nodes, i.e., 1) class size distributions (balanced vs. imbalanced), 2) edge connection proportions between classes (homophilic vs. heterophilic), 3) attribute values (biased vs. random), and 4) graph sizes (small vs. large). In addition, to foster future research on GNNs, we publicly release our codebase that allows users to evaluate various GNNs with various graphs. We hope this work offers interesting insights for future research.

Author Statement: Yes

TL;DR: We empirically study the performance of GNNs with various synthetic graphs by synthetically changing one or a few target characteristic(s) of graphs while keeping other characteristics fixed.

URL: https://github.com/seijimaekawa/empirical-study-of-GNNs

License: MIT License

Supplementary Material: pdf

Contribution Process Agreement: Yes

In Person Attendance: Yes

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 7 code implementations](https://www.catalyzex.com/paper/beyond-real-world-benchmark-datasets-an/code)

28 Replies

Loading