The Graphon Limit Hypothesis: Understanding Neural Network Pruning via Infinite Width Analysis

Hoang Pham; The-Anh Ta; Tom Jacobs; Rebekka Burkholz; Long Tran-Thanh

The Graphon Limit Hypothesis: Understanding Neural Network Pruning via Infinite Width Analysis

Hoang Pham, The-Anh Ta, Tom Jacobs, Rebekka Burkholz, Long Tran-Thanh

Published: 18 Sept 2025, Last Modified: 29 Oct 2025NeurIPS 2025 spotlightEveryoneRevisionsBibTeXCC BY 4.0

Keywords: Pruning Network, Graphon, Neural Tangent Kernel

TL;DR: We study the training dynamic of sparse neural networks through the lens of Graphon where masks of pruning methods converge to graphons as networks' width approach infinite

Abstract: Sparse neural networks promise efficiency, yet training them effectively remains a fundamental challenge. Despite advances in pruning methods that create sparse architectures, understanding why some sparse structures are better trainable than others with the same level of sparsity remains poorly understood. Aiming to develop a systematic approach to this fundamental problem, we propose a novel theoretical framework based on the theory of graph limits, particularly graphons, that characterizes sparse neural networks in the infinite-width regime. Our key insight is that connectivity patterns of sparse neural networks induced by pruning methods converge to specific graphons as networks' width tends to infinity, which encodes implicit structural biases of different pruning methods. We postulate the *Graphon Limit Hypothesis* and provide empirical evidence to support it. Leveraging this graphon representation, we derive a *Graphon Neural Tangent Kernel (Graphon NTK)* to study the training dynamics of sparse networks in the infinite width limit. Graphon NTK provides a general framework for the theoretical analysis of sparse networks. We empirically show that the spectral analysis of Graphon NTK correlates with observed training dynamics of sparse networks, explaining the varying convergence behaviours of different pruning methods. Our framework provides theoretical insights into the impact of connectivity patterns on the trainability of various sparse network architectures.

Supplementary Material: zip

Primary Area: Theory (e.g., control theory, learning theory, algorithmic game theory)

Submission Number: 17600

Loading