The Self-Consistent Theory of Neural Network Moments

TMLR Paper 6545 Authors

18 Nov 2025 (modified: 11 Dec 2025), under review for TMLR, CC BY 4.0
Abstract: This paper establishes a rigorous mathematical foundation for the statistical behavior of neural network parameter and gradient moments through self-consistent equations. We prove that the logarithmic moments exhibit a universal asymptotic decomposition governed by extremal statistics. This framework is extended to construct a joint partition function that unifies parameter and gradient statistics, revealing a topological phase distinction between states of correlated and uncorrelated extrema. The theory provides exact microscopic guarantees for finite networks while capturing emergent scaling behavior in large-scale systems.
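The abstract does not define its objects precisely, but the claim that logarithmic moments are "governed by extremal statistics" echoes a standard fact: if the logarithmic moment of a weight tensor is taken to be L_q = log E[|theta|^q] (an assumption here, not the paper's stated definition), then (1/q) L_q converges to log max|theta| as q grows, since the L^q norm tends to the L^infinity norm. The minimal NumPy sketch below illustrates that high-order log-moments are dominated by the extremal entry; it is an illustration of this general phenomenon, not a reconstruction of the paper's theory.

```python
# Illustrative sketch only. We ASSUME "logarithmic moment" means
# L_q = log E[|theta|^q] over the entries of a weight tensor; the paper's
# actual definitions are not given in the abstract. The standard fact
# (1/q) L_q -> log max|theta| as q -> infinity shows high-order log-moments
# are controlled by the extremal entry.
import numpy as np

rng = np.random.default_rng(0)
theta = rng.standard_normal(1_000_000)  # stand-in for flattened network weights

log_max = np.log(np.max(np.abs(theta)))  # extremal statistic: log of largest |entry|
for q in (2, 8, 32, 128):
    log_moment = np.log(np.mean(np.abs(theta) ** q))  # L_q = log E[|theta|^q]
    print(f"q={q:4d}  (1/q) L_q = {log_moment / q:+.4f}   log max|theta| = {log_max:+.4f}")
```

Running this, the normalized log-moment (1/q) L_q climbs toward log max|theta| as q increases, which is the sense in which a single extreme value can dominate high-order moment statistics.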
Submission Type: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Jeffrey_Pennington1
Submission Number: 6545