Weight Anisotropy in Mean-Field Theory: Learning on Isotropic Data

Published: 29 May 2026, Last Modified: 29 May 2026
Venue: HiLD at ICML 2026 Poster
License: CC BY 4.0
Keywords: mean-field theory, feature learning, high-dimensional learning dynamics, input feature selection, weight anisotropy, sparse parity, multi-index models, stochastic gradient Langevin dynamics, automatic relevance determination, phase transitions, sample complexity
Abstract: Neural networks efficiently learn from isotropic data with low-dimensional target structure, a setting where fixed-kernel limits fail. We trace this advantage to input feature selection (IFS): networks develop strong weight anisotropy along task-relevant coordinates. While standard Mean-Field (MF) theory captures the onset of feature learning, it tracks only first moments and thus misses IFS and underestimates post-transition generalisation. We introduce MF-ARD (MF with automatic relevance determination), which augments MF with a single additional set of order parameters for coordinate-wise precisions. MF-ARD captures the sharp generalisation transitions of finite-width networks.
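To make the notion of weight anisotropy concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's MF-ARD procedure. It builds an isotropic sparse-parity dataset (sparse parity is one of the listed keywords) and compares the per-coordinate second moments of a first-layer weight matrix on relevant versus irrelevant input coordinates. The names `sparse_parity`, `coordinate_second_moments`, and `anisotropy_ratio` are hypothetical, and the anisotropic weights are constructed by hand rather than learned.

```python
import numpy as np

def sparse_parity(n, d, relevant, rng):
    """Isotropic inputs x in {-1,+1}^d; the label is the product of the relevant coordinates."""
    x = rng.choice([-1.0, 1.0], size=(n, d))
    y = np.prod(x[:, relevant], axis=1)
    return x, y

def coordinate_second_moments(W):
    """Second moment of first-layer weights per input coordinate (W has shape width x d)."""
    return np.mean(W ** 2, axis=0)

def anisotropy_ratio(W, relevant):
    """Mean second moment on relevant coordinates divided by that on the remaining ones."""
    m = coordinate_second_moments(W)
    mask = np.zeros(W.shape[1], dtype=bool)
    mask[relevant] = True
    return m[mask].mean() / m[~mask].mean()

rng = np.random.default_rng(0)
d, width, relevant = 20, 256, [0, 1, 2]   # illustrative sizes, not taken from the paper
x, y = sparse_parity(1000, d, relevant, rng)

# Isotropic weights: every input coordinate has the same scale.
W_iso = rng.normal(size=(width, d)) / np.sqrt(d)

# Hand-made anisotropy (assumption): scale up the relevant columns by a factor of 5.
W_aniso = W_iso.copy()
W_aniso[:, relevant] *= 5.0

ratio_iso = anisotropy_ratio(W_iso, relevant)
ratio_aniso = anisotropy_ratio(W_aniso, relevant)
print(f"isotropic ratio:   {ratio_iso:.2f}")
print(f"anisotropic ratio: {ratio_aniso:.2f}")
```

Scaling the relevant columns by 5 multiplies the ratio by 25, since second moments scale quadratically. The inverse of each per-coordinate second moment can be read as a coordinate-wise precision in the ARD sense, though whether this matches the paper's order parameters is an assumption here.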
Submission Number: 45