Explaining Node Embeddings

TMLR Paper2859 Authors

12 Jun 2024 (modified: 09 Jul 2024) · Under review for TMLR · CC BY-SA 4.0
Abstract: Node embedding algorithms produce low-dimensional latent representations of nodes in a graph. These embeddings are often used for downstream tasks such as node classification and link prediction. In this paper, we investigate two questions: (Q1) Can we explain each embedding dimension with human-understandable graph features (e.g., degree, clustering coefficient, and PageRank)? (Q2) How can we modify existing node embedding algorithms to produce embeddings that can be easily explained by human-understandable graph features? We find that the answer to Q1 is yes, and we introduce a new framework called XM (short for eXplain eMbedding) to answer Q2. A key aspect of XM is minimizing the nuclear norm of the generated explanations. We show that minimizing the nuclear norm minimizes a lower bound on the entropy of the generated explanations. We test XM on a variety of real-world graphs and show that XM not only preserves the performance of existing node embedding methods but also enhances their explainability.
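The following is a minimal Python sketch of the kind of analysis Q1 describes, assuming the explanations take the form of per-dimension linear fits of embedding coordinates against graph features. The karate-club graph, the random placeholder embeddings, and the regression setup are illustrative assumptions, not the paper's exact procedure; only the feature set (degree, clustering coefficient, PageRank) and the nuclear-norm quantity come from the abstract.

```python
import networkx as nx
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy graph; random embeddings stand in for a real embedding method's output.
G = nx.karate_club_graph()
nodes = list(G.nodes())
rng = np.random.default_rng(0)
Z = rng.normal(size=(len(nodes), 8))  # n_nodes x embedding_dim

# Human-understandable graph features named in the abstract.
deg = dict(G.degree())
clust = nx.clustering(G)
pr = nx.pagerank(G)
X = np.array([[deg[v], clust[v], pr[v]] for v in nodes])

# One linear fit per embedding dimension; the coefficient rows form an
# "explanation" matrix W of shape (embedding_dim, n_features).
W = np.zeros((Z.shape[1], X.shape[1]))
for d in range(Z.shape[1]):
    model = LinearRegression().fit(X, Z[:, d])
    W[d] = model.coef_
    print(f"dim {d}: R^2 = {model.score(X, Z[:, d]):.3f}")

# Nuclear norm (sum of singular values) of the explanation matrix -- the
# quantity XM penalizes during training; computed here only for inspection.
print("nuclear norm of W:", np.linalg.norm(W, ord="nuc"))
```

Per-dimension R^2 then serves as one natural proxy for how well the interpretable features explain each embedding coordinate; with real (non-random) embeddings, high R^2 across dimensions would correspond to a "yes" answer to Q1.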
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Kuldeep_S._Meel2
Submission Number: 2859