Where Did the Gap Go? Reassessing the Long-Range Graph Benchmark

Published: 30 May 2024, Last Modified: 30 May 2024. Accepted by TMLR.
Abstract: The recent Long-Range Graph Benchmark (LRGB, Dwivedi et al. 2022) introduced a set of graph learning tasks strongly dependent on long-range interaction between vertices. Empirical evidence suggests that on these tasks Graph Transformers significantly outperform Message Passing GNNs (MPGNNs). In this paper, we carefully reevaluate multiple MPGNN baselines as well as the Graph Transformer GPS (Rampášek et al. 2022) on LRGB. Through a rigorous empirical analysis, we demonstrate that the reported performance gap is overestimated due to suboptimal hyperparameter choices. Notably, across multiple datasets the performance gap vanishes completely after basic hyperparameter optimization. In addition, we discuss the impact of missing feature normalization for LRGB's vision datasets and highlight a spurious implementation of LRGB's link prediction metric. The principal aim of our paper is to establish a higher standard of empirical rigor within the graph machine learning community.
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: We addressed the reviewers' suggestions for improvement:
* The introduction of LRGB has been extended for clarity.
* The related work provides more details on the compared methods and their ability to capture long-range information.
* We added scatter plots to the appendix that show the spread of validation results observed during tuning, showing that GPS is not more stable than MPGNNs in this regard.
* We extended the conclusion with a more thorough discussion of future directions for benchmarking graph transformers.
The second revision only added an Acknowledgments section.
Code: https://github.com/toenshoff/LRGB
Supplementary Material: zip
Assigned Action Editor: ~bo_han2
Submission Number: 1927