Bag of Features: New Baselines for GNNs for Link Prediction

20 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: graph representation learning, link prediction, feature engineering, graph neural networks
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: Motivated by recent theoretical studies questioning the expressivity of GNNs, we compare SOTA GNN models with simple feature-engineering methods on the link prediction problem and obtain competitive results on benchmark datasets.
Abstract: Graph Neural Networks (GNNs) have brought a significant transformation to the field of graph representation learning. They achieve this through a neighborhood aggregation scheme, in which a node's representation vector is computed iteratively by aggregating and transforming the vectors of its neighboring nodes. Although GNNs have demonstrated superior performance across various domains over the past decade, recent theoretical studies have raised concerns about their expressive power, showing that GNN models yield results comparable to the well-established Weisfeiler-Lehman algorithm. Motivated by these findings, we compare the performance of current GNN models with conventional feature extraction methods in the context of link prediction. Our experiments reveal that, when applied to standard feature sets derived from node neighborhoods and node features, standard machine learning (ML) models deliver highly competitive results, even against cutting-edge GNN models. This holds across both small and large benchmark datasets, including those from the Open Graph Benchmark (OGB). Our empirical findings corroborate the aforementioned theoretical observations and suggest that there is ample room for improvement before current GNN models reach their full potential.
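To make the abstract's "standard feature sets derived from node neighborhoods" concrete, here is a minimal sketch of classical link-prediction heuristics computed for a candidate node pair; the resulting feature vector could then be fed to any standard ML classifier. The specific feature choices and the toy graph below are illustrative assumptions, not the paper's exact feature set.

```python
import math

def neighborhood_features(adj, u, v):
    """Classical link-prediction heuristics for a node pair (u, v).

    `adj` maps each node to the set of its neighbors in an undirected graph.
    These are common textbook features (a hypothetical selection, not the
    paper's exact bag of features).
    """
    nu, nv = adj[u], adj[v]
    common = nu & nv
    union = nu | nv
    return {
        "common_neighbors": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        # Adamic-Adar: down-weight high-degree common neighbors
        "adamic_adar": sum(
            1.0 / math.log(len(adj[w])) for w in common if len(adj[w]) > 1
        ),
        "pref_attachment": len(nu) * len(nv),
    }

# Toy undirected graph stored as an adjacency map of neighbor sets
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (3, 4)]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

feats = neighborhood_features(adj, 0, 3)
print(feats["common_neighbors"])  # nodes 1 and 2 are shared -> 2
```

In a full pipeline, such per-pair features (optionally concatenated with node attributes) would be computed for sampled positive and negative edges and passed to an off-the-shelf classifier such as logistic regression or gradient-boosted trees.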
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: pdf
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2621