Interpretable Graph Embeddings: Feature-Level Decomposition for Trustworthy Graph Neural Networks

19 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Graph Neural Networks, Embedding Decomposition, Explainable AI
Abstract: Graph Neural Networks (GNNs) have achieved state-of-the-art performance on tasks such as user-item interaction prediction in recommender systems, molecular property classification, and credit risk scoring and fraud detection in financial risk modeling. However, their opaque embedding mechanisms raise critical concerns about transparency and trustworthiness. Existing explainability approaches largely focus on identifying the nodes, edges, or subgraphs that influence the model's prediction but fail to disentangle how individual node features shape learned embeddings. In this work, we propose a novel decomposition framework that systematically attributes each embedding to original node and/or edge features. We qualitatively demonstrate the framework on Graph Convolutional Networks (GCN) and Heterogeneous GraphSAGE (HinSAGE) using Cora and MovieLens, and quantitatively benchmark against widely adopted baselines across multiple datasets. Results indicate that our approach improves interpretability by revealing how node features contribute to individual graph embeddings and clarifying the role of neighborhood aggregation in shaping predictions. This work connects structural explainability and feature-level attribution, providing a principled foundation for trustworthy and actionable GNN explanations.
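To make the idea of feature-level attribution concrete, here is a minimal NumPy sketch of one way an embedding can be decomposed into per-feature contributions for a single linear GCN layer (pre-activation). Because the layer is linear in the input features, the embedding splits exactly into a sum of rank-1 terms, one per feature. This is an illustrative assumption about the general approach, not necessarily the decomposition the paper itself proposes; all names (`A_hat`, `contribs`) are hypothetical.

```python
import numpy as np

# Toy graph: 3 nodes; adjacency with self-loops, GCN-style symmetric normalization.
A = np.array([[1., 1., 0.],
              [1., 1., 1.],
              [0., 1., 1.]])
d = A.sum(axis=1)
A_hat = A / np.sqrt(np.outer(d, d))  # D^{-1/2} (A + I) D^{-1/2}

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 4))   # node features: 3 nodes, 4 features
W = rng.normal(size=(4, 2))   # layer weights: 4 features -> 2 embedding dims

# Pre-activation embedding of one GCN layer.
H = A_hat @ X @ W

# Feature-level decomposition: feature f contributes A_hat @ X[:, [f]] @ W[[f], :].
# The per-feature terms sum exactly back to the embedding H.
contribs = np.stack([A_hat @ X[:, [f]] @ W[[f], :] for f in range(X.shape[1])])
assert np.allclose(contribs.sum(axis=0), H)
```

Each `contribs[f]` tells you how much input feature `f`, aggregated over the neighborhood, shapes every node's embedding; nonlinearities between layers break this exactness, which is part of what a full decomposition framework has to address.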
Supplementary Material: zip
Primary Area: interpretability and explainable AI
Submission Number: 20984