Keywords: Link prediction, Inductive learning, Isolated nodes, Unsupervised pre-training, Generalizability, End-to-end training
TL;DR: A two-shot learning approach, in which node attributes are pre-trained without supervision on a corpus larger than the observed graph, improves inductive link prediction for isolated nodes.
Abstract: Link prediction is a central task in graph machine learning: anticipating connections between entities in a network. In drug discovery, link prediction takes the form of forecasting interactions between drugs and target genes; in recommender systems, it amounts to suggesting items to users; and in temporal graphs, it ranges from friendship recommendation to introducing new devices in wireless networks and dynamic routing. A prevailing challenge, however, is the reliance on topological neighborhoods and the lack of informative node metadata when making predictions. Consequently, predictions for low-degree nodes, and especially for newly introduced nodes with no neighborhood data, tend to be inaccurate and misleading. State-of-the-art models frequently fall short when asked to predict interactions between a novel drug and an unexplored disease target, or to suggest a new product to a recently onboarded user; in temporal graphs, they often misplace a newly introduced entity in the evolving network. This paper examines the observation bias arising from the unequal availability of data across entities in a network and from the absence of informative node metadata, and shows how contemporary models struggle to make inductive link predictions for low-degree and previously unseen isolated nodes. We further propose a non-end-to-end training approach that harnesses informative node attributes generated by unsupervised pre-training on corpora that differ from, and contain significantly more entities than, the observed graphs, improving the overall generalizability of link prediction models.
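To make the proposed two-stage, non-end-to-end setup concrete, the sketch below separates (1) node attributes obtained from unsupervised pre-training on a larger corpus from (2) a downstream link predictor trained only on observed edges. It is a minimal illustration under stated assumptions, not the paper's actual pipeline: the embeddings are random stand-ins for pre-trained attributes, the Hadamard edge features and logistic-regression scorer are placeholder choices, and all sizes and node indices are hypothetical.

```python
# Hypothetical sketch of a two-stage (non-end-to-end) link prediction pipeline.
# Model choices, data, and names are placeholders, not the paper's method.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# --- Stage 1: unsupervised pre-training (stand-in) -------------------------
# In the setting described by the abstract, node attributes would come from a
# model pre-trained on a corpus with many more entities than the observed
# graph (e.g. text descriptions of drugs/genes or products/users). Random
# vectors are used here only so the sketch runs end to end.
num_nodes, dim = 200, 32
node_attr = rng.normal(size=(num_nodes, dim))  # pretend: frozen pre-trained embeddings

# --- Observed graph: training edges among "seen" nodes; some nodes isolated --
train_edges = rng.integers(0, 150, size=(500, 2))      # positive edges among seen nodes
neg_edges = rng.integers(0, num_nodes, size=(500, 2))   # sampled non-edges

def edge_features(pairs):
    """Hadamard product of endpoint attributes as the edge representation."""
    return node_attr[pairs[:, 0]] * node_attr[pairs[:, 1]]

X = np.vstack([edge_features(train_edges), edge_features(neg_edges)])
y = np.concatenate([np.ones(len(train_edges)), np.zeros(len(neg_edges))])

# --- Stage 2: link predictor trained on top of the frozen attributes --------
clf = LogisticRegression(max_iter=1000).fit(X, y)

# Inductive query: score a link touching node 180, which has no training edges
# (an "isolated" node) but still has pre-trained attributes available.
query = np.array([[180, 10]])
print("P(link):", clf.predict_proba(edge_features(query))[0, 1])
```

Because the attributes are produced independently of the observed graph, the predictor can still score links for nodes that never appear in the training edges, which is the inductive, isolated-node case the abstract targets.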
Format: Long paper, up to 8 pages. If the reviewers recommend that it be changed to a short paper, I would prefer to withdraw the submission.
Submission Number: 41