Keywords: map equation, community-based, node similarity, representation learning, compression, link prediction
Abstract: Node similarity scores are a foundation for machine learning on graphs, underpinning clustering, node classification, anomaly detection, and link prediction, with applications in biological systems, information networks, and recommender systems. Recent work on link prediction uses vector space embeddings to calculate node similarities in undirected networks with good performance. Still, these methods have several disadvantages: limited interpretability, the need for hyperparameter tuning, manual model fitting through dimensionality reduction, and poor performance in directed link prediction because their similarities are symmetric. We propose MapSim, an information-theoretic measure that assesses node similarities based on modular compression of network flows. Unlike vector space embeddings, MapSim represents nodes in a discrete, non-metric space of communities and yields asymmetric similarities in an unsupervised fashion. We compare MapSim with popular embedding-based algorithms on a link prediction task across 47 networks and find that MapSim's average performance across all networks is more than 7% higher than that of its closest competitor, and that it outperforms all embedding methods in 11 of the 47 networks. Our method demonstrates the potential of compression-based approaches in graph representation learning, with promising applications in other graph learning tasks.
Type Of Submission: Full paper proceedings track submission (max 9 main pages).
TL;DR: We develop an information-theoretic, community-based node similarity measure that can be used for link prediction and other learning tasks on graphs.