Covered Forest: Fine-grained generalization analysis of graph neural networks

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 Spotlight Poster · CC BY 4.0
TL;DR: We provide tighter generalization bounds for MPNNs by considering the pseudometric geometry of MPNNs' feature space
Abstract: The expressive power of message-passing graph neural networks (MPNNs) is reasonably well understood, primarily through combinatorial techniques from graph isomorphism testing. However, MPNNs' generalization abilities---making meaningful predictions beyond the training set---remain less explored. Current generalization analyses often overlook graph structure, limit the focus to specific aggregation functions, and assume the impractical, hard-to-optimize $0$-$1$ loss function. Here, we extend recent advances in graph similarity theory to assess the influence of graph structure, aggregation, and loss functions on MPNNs' generalization abilities. Our empirical study supports our theoretical insights, improving our understanding of MPNNs' generalization properties.
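Since the abstract grounds MPNNs' expressive power in combinatorial techniques from graph isomorphism testing, the following minimal sketch (not taken from the paper; the function names `wl_colors` and `color_histogram` are illustrative) shows the 1-Weisfeiler-Leman color refinement procedure that upper-bounds what standard message-passing with permutation-invariant aggregation can distinguish.

```python
# Hedged sketch: 1-Weisfeiler-Leman (1-WL) color refinement on a plain
# adjacency-list graph (node -> list of neighbors). No external libraries.

def wl_colors(adj, num_rounds=3):
    """Refine node colors by combining each node's color with the
    multiset of its neighbors' colors, then relabeling consistently."""
    colors = {v: 0 for v in adj}  # uniform initial coloring
    for _ in range(num_rounds):
        signatures = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
        # Nodes with identical signatures receive identical new colors.
        palette = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
        colors = {v: palette[signatures[v]] for v in adj}
    return colors

def color_histogram(adj):
    """Graph-level 1-WL summary: the sorted multiset of final node colors."""
    return tuple(sorted(wl_colors(adj).values()))

# Classic example: a 6-cycle and two disjoint triangles are both 2-regular,
# so 1-WL (and hence any standard MPNN) cannot tell them apart.
cycle_6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                 3: [4, 5], 4: [3, 5], 5: [3, 4]}
print(color_histogram(cycle_6) == color_histogram(two_triangles))  # True
```

The final line prints `True`, illustrating the kind of expressiveness limitation the abstract refers to; the paper's contribution is orthogonal, analyzing how graph structure, aggregation, and loss functions affect generalization rather than expressivity.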
Lay Summary: Machine learning models are most useful when they can make reliable predictions not just on the data they were trained on, but also on new, unseen data. In this project, we investigate this ability---called generalization---for graph neural networks (GNNs), a type of neural network designed to work with graph-structured data such as social networks or chemical molecules. We develop mathematical tools to estimate how many training examples are needed to ensure good performance on new data. Importantly, we also reveal how the graphs' structure (how their nodes and connections are arranged) impacts the model's ability to generalize. These insights help guide the design of more effective GNNs and the more efficient use of training data in practice.
Primary Area: Deep Learning->Graph Neural Networks
Keywords: MPNNs, generalization, bounds, theory, Weisfeiler, Leman, Lehman
Submission Number: 252