Keywords: Graph Neural Networks, Explainable AI, Concept Bottleneck Model
Abstract: We introduce Graph Concept Bottleneck (GCB) as a new paradigm for self-explainable Graph Neural Networks. GCB maps graphs into a concept space (a concept bottleneck) in which each concept is a natural-language phrase, and predictions are made from these concepts. Unlike existing interpretable GNNs, which primarily rely on subgraphs as explanations, the concept bottleneck provides a more human-understandable form of interpretation. To refine the concept space, we apply the information bottleneck principle, encouraging the model to focus on causal concepts rather than spurious ones. This not only yields more compact and faithful explanations but also explicitly steers the model's reasoning toward the correct decision. We empirically show that GCB is intrinsically interpretable while matching the accuracy of black-box GNNs. Moreover, it performs better under distribution shifts and data perturbations, demonstrating improved robustness and generalizability as a natural byproduct of concept-based reasoning.
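To make the architecture concrete, below is a minimal sketch (not the authors' implementation) of a graph concept bottleneck in plain PyTorch: a simple two-step GNN encoder pools a graph into an embedding, a concept head scores each of K human-readable concepts, and the label is predicted from those concept scores alone. All names (`GraphConceptBottleneck`, `concept_head`, `beta`) are hypothetical, and the L1 penalty on concept activations is only a crude stand-in for the paper's information-bottleneck regularizer.

```python
# Hedged sketch of a graph concept bottleneck model; assumptions noted above.
import torch
import torch.nn as nn

class GraphConceptBottleneck(nn.Module):
    def __init__(self, in_dim, hidden_dim, num_concepts, num_classes):
        super().__init__()
        self.gnn1 = nn.Linear(in_dim, hidden_dim)       # message-passing step 1
        self.gnn2 = nn.Linear(hidden_dim, hidden_dim)    # message-passing step 2
        self.concept_head = nn.Linear(hidden_dim, num_concepts)  # one logit per concept phrase
        self.predictor = nn.Linear(num_concepts, num_classes)    # label from concepts only

    def forward(self, x, adj):
        # x: [num_nodes, in_dim]; adj: row-normalized adjacency [num_nodes, num_nodes]
        h = torch.relu(self.gnn1(adj @ x))
        h = torch.relu(self.gnn2(adj @ h))
        g = h.mean(dim=0)                                 # mean-pool nodes into a graph embedding
        concepts = torch.sigmoid(self.concept_head(g))    # concept activations in [0, 1]
        logits = self.predictor(concepts)                 # prediction passes through the bottleneck
        return logits, concepts

def loss_fn(logits, concepts, label, beta=0.01):
    # Cross-entropy plus an L1 penalty on concept activations: a simple proxy
    # (assumption, not the paper's exact objective) for the bottleneck term
    # that suppresses spurious concepts and keeps explanations compact.
    ce = nn.functional.cross_entropy(logits.unsqueeze(0), label.unsqueeze(0))
    return ce + beta * concepts.abs().mean()
```

Because the predictor sees only the concept activations, the returned `concepts` vector doubles as the explanation for each prediction.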
Supplementary Material: zip
Primary Area: learning on graphs and other geometries & topologies
Submission Number: 20533