Interpretable Graph Embeddings: Feature-Level Decomposition for Trustworthy Graph Neural Networks

19 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Graph Neural Networks, Embedding Decomposition, Explainable AI
Abstract: Graph Neural Networks (GNNs) have achieved state-of-the-art performance on tasks such as user-item interaction prediction in recommender systems, molecular property classification, and credit risk scoring and fraud detection in financial risk modeling. However, their opaque embedding mechanisms raise critical concerns about transparency and trustworthiness. Existing explainability approaches largely focus on identifying the nodes, edges, or subgraphs that influence the model's prediction but fail to disentangle how individual node features shape learned embeddings. In this work, we propose a novel decomposition framework that systematically attributes each embedding to original node and/or edge features. We qualitatively demonstrate the framework on Graph Convolutional Networks (GCN) and Heterogeneous GraphSAGE (HinSAGE) using Cora and MovieLens, and quantitatively benchmark against widely adopted baselines across multiple datasets. Results indicate that our approach improves interpretability by revealing how node features contribute to individual graph embeddings and clarifying the role of neighborhood aggregation in shaping predictions. This work connects structural explainability and feature-level attribution, providing a principled foundation for trustworthy and actionable GNN explanations.
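To make the idea of feature-level attribution concrete, here is a minimal NumPy sketch of one way an embedding can be decomposed into per-feature contributions for a single linear GCN layer (pre-activation). Because the layer is linear in the input features, the embedding splits exactly into a sum of rank-1 terms, one per feature. This is an illustrative assumption about the general approach, not necessarily the decomposition the paper itself proposes; all names (`A_hat`, `contribs`) are hypothetical.

```python
import numpy as np

# Toy graph: 3 nodes; adjacency with self-loops, GCN-style symmetric normalization.
A = np.array([[1., 1., 0.],
              [1., 1., 1.],
              [0., 1., 1.]])
d = A.sum(axis=1)
A_hat = A / np.sqrt(np.outer(d, d))  # D^{-1/2} (A + I) D^{-1/2}

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))   # node features: 3 nodes, 4 features
W = rng.normal(size=(4, 2))   # layer weights: 4 features -> 2 embedding dims

# Pre-activation embedding of one GCN layer.
H = A_hat @ X @ W

# Feature-level decomposition: feature f contributes A_hat @ X[:, [f]] @ W[[f], :].
# The per-feature terms sum exactly back to the embedding H.
contribs = np.stack([A_hat @ X[:, [f]] @ W[[f], :] for f in range(X.shape[1])])
assert np.allclose(contribs.sum(axis=0), H)
```

Each `contribs[f]` tells you how much input feature `f`, aggregated over the neighborhood, shapes every node's embedding; nonlinearities between layers break this exactness, which is part of what a full decomposition framework has to address.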
Supplementary Material: zip
Primary Area: interpretability and explainable AI
Submission Number: 20984