Rethinking the Effectiveness of Graph Classification Datasets in Benchmarks for Assessing GNNs

23 Sept 2023 (modified: 27 Mar 2024)ICLR 2024 Conference Withdrawn SubmissionEveryoneRevisionsBibTeX
Primary Area: datasets and benchmarks
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: graph classification benchmark, graph neural networks, effectiveness of dataset
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Graph classification benchmarks, vital for assessing and developing graph neural network (GNN) models, have recently been scrutinized, as simple methods like MLPs have demonstrated comparable performance on certain datasets. This leads to an important question: Do these benchmarks effectively distinguish the advancements of GNNs over other methodologies? If so, how do we quantitatively measure this effectiveness? In response, we propose an empirical protocol based on a fair benchmarking framework to investigate the performance discrepancy between simple methods and GNNs. We further propose a novel metric to quantify the effectiveness of a dataset by utilizing the performance gaps and considering dataset complexity. Through extensive testing across 16 real-world datasets, we found our metric to align with existing studies and intuitive assumptions. Finally, to explore the causes behind the low effectiveness, we investigated the relationship between intrinsic graph properties and task labels and developed a novel technique for generating more synthetic datasets that can precisely control these correlations. Our findings shed light on the current understanding of benchmark datasets, and our new platform backed by an effectiveness validation protocol could fuel the future evolution of graph classification benchmarks.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7184
Loading