Understanding Nonlinear Implicit Bias via Region Counts in Input Space

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY 4.0
Abstract: One explanation for the strong generalization ability of neural networks is implicit bias. Yet the definition and mechanism of implicit bias in non-linear contexts remain poorly understood. In this work, we propose to characterize implicit bias by the number of connected regions in the input space that share the same predicted label. Compared with parameter-dependent metrics (e.g., norm or normalized margin), region count is better suited to nonlinear, overparameterized models, because it is determined by the function mapping and is invariant to reparametrization. Empirically, we find that small region counts align with geometrically simple decision boundaries and correlate well with good generalization performance. We also observe that good hyperparameter choices, such as larger learning rates and smaller batch sizes, induce small region counts. We further establish a theoretical connection, explaining how a larger learning rate can induce small region counts in neural networks.
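To make the central quantity concrete, here is a minimal sketch of how one might estimate the region count on a 2D slice of input space: predict a label at every point of a grid over a plane and count connected components of constant predicted label. The helper name `estimate_region_count`, the choice of anchor points, and the grid resolution are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np
import torch
from scipy.ndimage import label

def estimate_region_count(model, x0, x1, x2, grid_size=256, chunk=4096):
    """Estimate the region count on a 2D plane through input space.

    The plane is spanned by three anchor inputs x0, x1, x2 (1D numpy
    arrays of flattened inputs). A label is predicted at every point of
    a grid over the plane; the region count of this slice is the total
    number of connected components of constant predicted label.
    """
    # Parameterize the plane: p(a, b) = x0 + a*(x1 - x0) + b*(x2 - x0).
    coords = np.linspace(0.0, 1.0, grid_size)
    A, B = np.meshgrid(coords, coords)
    points = (x0[None, :]
              + A.reshape(-1, 1) * (x1 - x0)[None, :]
              + B.reshape(-1, 1) * (x2 - x0)[None, :])

    # Predict labels in chunks to keep memory bounded. Assumes a model
    # that maps a batch of flattened inputs to class logits.
    preds = []
    with torch.no_grad():
        for batch in torch.split(torch.as_tensor(points, dtype=torch.float32), chunk):
            preds.append(model(batch).argmax(dim=1))
    pred_grid = torch.cat(preds).numpy().reshape(grid_size, grid_size)

    # Count 4-connected components of each constant-label mask and sum
    # over all classes that appear on the grid.
    total = 0
    for c in np.unique(pred_grid):
        _, n_components = label(pred_grid == c)
        total += n_components
    return total
```

Since exact region counting in high dimensions is intractable, a 2D slice like this is a natural surrogate; in practice one would presumably average the estimate over many random planes (e.g., planes through triplets of training points), with a lower average count indicating a geometrically simpler decision function.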
Lay Summary: Neural networks often perform well even when they have far more parameters than training examples. One reason is called “implicit bias” — hidden preferences that shape how the network learns. In this work, we describe implicit bias by counting how many separate regions a model divides the input space into. Fewer regions often mean simpler decisions and better generalization. We show that training with large learning rates or small batch sizes naturally leads to fewer regions, helping explain why these settings often improve performance. Our method offers a simple, effective way to understand and predict how well a model will generalize.
Primary Area: Theory->Deep Learning
Keywords: implicit bias, region counts, non-linear neural network, generalization gap
Submission Number: 8322