Stability and Generalization Capability of Subgraph Reasoning Models for Inductive Knowledge Graph Completion

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY-NC-SA 4.0
TL;DR: We analyze the relationship between the stability and generalization capability of subgraph reasoning models for inductive knowledge graph completion by relating the generalization bound to the stability under the Relational Tree Mover’s Distance.
Abstract: Inductive knowledge graph completion aims to predict missing triplets in an incomplete knowledge graph that differs from the one observed during training. While subgraph reasoning models have demonstrated empirical success in this task, their theoretical properties, such as stability and generalization capability, remain unexplored. In this work, we present the first theoretical analysis of the relationship between stability and generalization capability for subgraph reasoning models. Specifically, we define stability as the degree of consistency in a subgraph reasoning model's outputs in response to differences in input subgraphs, and we introduce the Relational Tree Mover's Distance as a metric to quantify the differences between subgraphs. We then show that the generalization capability of subgraph reasoning models, defined as the discrepancy between performance on training data and test data, is proportional to their stability. Furthermore, we empirically analyze the impact of stability on generalization capability using real-world datasets, validating our theoretical findings.
Lay Summary: Knowledge graphs (KGs) represent real-world facts as triplets, but they often contain missing triplets. Subgraph reasoning models predict missing triplets using the subgraph around each triplet. These models remain useful even in inductive knowledge graph completion (KGC), where the KG seen at inference time contains new entities. However, their theoretical properties, such as the relationship between stability and generalization capability, remain unexplored. We analyze stability, the degree of output consistency under changes to the input subgraph, measured by the Relational Tree Mover's Distance. Generalization capability is measured by the generalization bound, i.e., the discrepancy between performance on training and test data. We theoretically prove that more stable subgraph reasoning models tend to exhibit higher generalization capability. We validate our theoretical findings on real-world inductive KGC benchmarks. Our analysis highlights the importance of designing stable subgraph reasoning models to improve generalization in inductive KGC.
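The stability notion described above can be illustrated with a minimal sketch. This is not the paper's method: `score` and `dist` below are hypothetical stand-ins (a triplet count and a symmetric-difference size) for a real subgraph reasoning model and the Relational Tree Mover's Distance; the point is only that stability bounds the change in output per unit of change in the input subgraph.

```python
# Hypothetical sketch of the stability notion: how much a model's output
# changes relative to the distance between two input subgraphs.
# `score` stands in for a subgraph reasoning model and `dist` for the
# Relational Tree Mover's Distance; both are placeholder assumptions.

def stability_ratio(score, dist, g1, g2):
    """Output change per unit of input-subgraph distance."""
    d = dist(g1, g2)
    if d == 0:
        return 0.0
    return abs(score(g1) - score(g2)) / d

# Toy usage: subgraphs as sets of (head, relation, tail) triplets,
# distance as the number of differing triplets, score as triplet count.
g1 = {("a", "r", "b"), ("b", "r", "c")}
g2 = {("a", "r", "b")}
ratio = stability_ratio(len, lambda x, y: len(x ^ y), g1, g2)
# ratio == 1.0: the toy score changes by one per differing triplet
```

A model with a small ratio across many subgraph pairs is, in this informal sense, more stable, which is the property the paper relates to a tighter generalization bound.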
Primary Area: Deep Learning->Graph Neural Networks
Keywords: Subgraph Reasoning, Inductive Knowledge Graph Completion, Stability, Generalization Capability, Link Prediction
Submission Number: 11375