Keywords: Graph Neural Networks, Embedding Decomposition, Explainable AI
Abstract: Graph Neural Networks (GNNs) have achieved state-of-the-art performance on
tasks such as user-item interaction prediction in recommender systems, molecular
property classification, and credit risk scoring and fraud detection in financial
risk modeling. However, their opaque embedding mechanisms raise critical
concerns about transparency and trustworthiness. Existing explainability approaches
largely focus on identifying the nodes, edges, or subgraphs that influence
the model’s prediction but fail to disentangle how individual node features shape
learned embeddings. In this work, we propose a novel decomposition framework
that systematically attributes each embedding to the original node and/or edge
features. Our method inverts GNN layers into contribution pathways, enabling
fine-grained attribution across heterogeneous feature streams. We demonstrate
our framework on Graph Convolutional Networks (GCN) and Heterogeneous GraphSAGE
(HinSAGE), evaluating on the Cora and MovieLens datasets. Results show
that our approach enhances interpretability by tracing predictive embeddings to
semantically meaningful features. This work bridges structural explainability and
feature-level attribution, providing a principled foundation for trustworthy and actionable
GNN explanations.
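To make the additive-decomposition idea concrete, the minimal sketch below shows how, for a single linear GCN layer H = ÂXW (activation omitted so the layer is exactly linear), each node embedding splits into per-input-feature contributions that sum back to the full embedding. This is an illustrative assumption, not the paper's actual implementation; all names (A_hat, X, W) are hypothetical stand-ins.

```python
import numpy as np

# Minimal sketch of per-feature embedding decomposition for one linear
# GCN layer H = A_hat @ X @ W. Hypothetical example data, not the
# paper's method or code.
rng = np.random.default_rng(0)
n_nodes, n_feats, n_hidden = 5, 4, 3

A_hat = rng.random((n_nodes, n_nodes))   # stand-in normalized adjacency
A_hat /= A_hat.sum(axis=1, keepdims=True)
X = rng.random((n_nodes, n_feats))       # node feature matrix
W = rng.random((n_feats, n_hidden))      # layer weight matrix

# Full layer output: one embedding per node.
H = A_hat @ X @ W                        # shape (n_nodes, n_hidden)

# Contribution of input feature f: keep only column f of X
# (and row f of W) and re-run the layer.
contributions = np.stack([
    A_hat @ (X[:, [f]] @ W[[f], :]) for f in range(n_feats)
])                                       # shape (n_feats, n_nodes, n_hidden)

# By linearity, the per-feature contributions sum exactly to H.
assert np.allclose(contributions.sum(axis=0), H)
```

With nonlinear activations or multiple layers, this exact additivity no longer holds, which is presumably where the paper's contribution-pathway inversion departs from the naive linear case sketched here.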
Primary Area: interpretability and explainable AI
Submission Number: 20984