Attribute-Enhanced Similarity Ranking for Sparse Link Prediction

Zexi Huang; Joao Pedro Rodrigues Mattos; Mert Kosan; Arlei Lopes da Silva; Ambuj Singh

24 Sept 2023 (modified: 11 Feb 2024). Submitted to ICLR 2024.

Primary Area: learning on graphs and other geometries & topologies

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Keywords: Link Prediction, Graph Neural Networks, Graph Learning, Network Science

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.

TL;DR: We propose Gelato, a similarity-based link-prediction method that applies graph learning, a ranking loss, and partitioning-based negative sampling.

Abstract: Link prediction is a fundamental problem in graph data. In its most realistic setting, the problem consists of predicting missing or future links between random pairs of nodes from the set of disconnected pairs. Graph Neural Networks (GNNs) have become the predominant framework for link prediction. GNN-based methods treat link prediction as a binary classification problem and handle the extreme class imbalance (real graphs are very sparse) by sampling, uniformly at random, a balanced number of disconnected pairs not only for training but also for evaluation. However, we show that the reported performance of GNNs for link prediction in the balanced setting does not translate to the more realistic imbalanced setting and that simpler topology-based approaches are often better at handling sparsity. These findings motivate Gelato, a similarity-based link-prediction method that applies (1) graph learning based on node attributes to enhance a topological heuristic, (2) a ranking loss for addressing class imbalance, and (3) a negative sampling scheme that efficiently selects hard training pairs via graph partitioning. Experiments show that Gelato is more accurate and faster than GNN-based alternatives.
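
The gap between balanced and imbalanced evaluation described in the abstract can be illustrated with simple arithmetic. The sketch below is not taken from the paper: the classifier rates, the graph size, and the helper `precision` are all hypothetical values chosen for illustration. It shows how the same classifier can look near-perfect when negatives are subsampled to match positives, yet yield low precision when every disconnected pair is a candidate.

```python
def precision(tpr, fpr, n_pos, n_neg):
    """Precision of a classifier with the given true/false positive rates."""
    tp = tpr * n_pos   # expected true positives
    fp = fpr * n_neg   # expected false positives
    return tp / (tp + fp)

# Hypothetical classifier (assumption): recovers 90% of true links and
# mislabels 1% of disconnected pairs as links.
TPR, FPR = 0.9, 0.01

# Hypothetical sparse graph (assumption): 10,000 nodes, 50,000 held-out links.
n_nodes = 10_000
n_pos = 50_000
n_pairs = n_nodes * (n_nodes - 1) // 2
n_neg_all = n_pairs - n_pos  # every disconnected pair is a negative

# Balanced setting: as many sampled negatives as positives.
balanced = precision(TPR, FPR, n_pos, n_pos)
# Imbalanced setting: all disconnected pairs are evaluated.
imbalanced = precision(TPR, FPR, n_pos, n_neg_all)

print(f"balanced precision:   {balanced:.3f}")
print(f"imbalanced precision: {imbalanced:.3f}")
```

Under these assumed numbers, the false-positive rate is unchanged between the two settings, but the number of negatives grows by roughly three orders of magnitude, so false positives dominate the predicted links in the imbalanced case.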

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 8672
