Keywords: Goldilocks Zone, Initialization, Trainability, Curvature Analysis, Architectural Inhomogeneity
TL;DR: We analyze the Goldilocks zone in inhomogeneous networks and show that softmax temperature scaling provides a more robust probe of curvature and trainability than weight scaling.
Abstract: We investigate how architectural inhomogeneities, such as biases, layer normalization, and residual connections, affect the curvature of the loss landscape at initialization and its link to trainability. We focus on the Goldilocks zone, a region in parameter space with excess positive curvature, previously associated with improved optimization in homogeneous networks. To extend this analysis, we compare two scaling strategies: weight scaling and softmax temperature scaling.
Our results show that in networks with biases or residual connections, both strategies identify a Goldilocks zone aligned with better training. In contrast, layer normalization yields lower or even negative curvature while optimization remains stable, revealing a disconnect between curvature and trainability. Softmax temperature scaling behaves more consistently across models, making it a more robust probe. Overall, the Goldilocks zone remains relevant in inhomogeneous networks, but its geometry and predictive power depend on architectural choices, particularly normalization.
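Below is a minimal, self-contained sketch (not the authors' code) of the two probes the abstract contrasts: weight scaling, which multiplies every parameter by a factor alpha, and softmax temperature scaling, which divides the logits by a temperature T before the loss. The toy MLP, random data, and the curvature statistic (fraction of positive Hessian eigenvalues at initialization, in the spirit of the Goldilocks-zone analysis) are illustrative assumptions, feasible only at this tiny scale.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def make_mlp():
    # Tiny MLP with biases, one of the inhomogeneities the paper discusses.
    return nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 3))

def loss_fn(model, x, y, temperature=1.0):
    # Softmax temperature scaling: divide logits by T before cross-entropy.
    logits = model(x) / temperature
    return nn.functional.cross_entropy(logits, y)

def positive_curvature_fraction(model, x, y, temperature=1.0):
    # Fraction of positive eigenvalues of the loss Hessian w.r.t. parameters,
    # an excess-positive-curvature statistic of the kind used to delineate
    # the Goldilocks zone. Exact eigendecomposition only works for tiny nets.
    params = list(model.parameters())
    loss = loss_fn(model, x, y, temperature)
    grads = torch.autograd.grad(loss, params, create_graph=True)
    flat_grad = torch.cat([g.reshape(-1) for g in grads])
    rows = []
    for i in range(flat_grad.numel()):
        # Each Hessian row is the gradient of one gradient component.
        row = torch.autograd.grad(flat_grad[i], params, retain_graph=True)
        rows.append(torch.cat([r.reshape(-1) for r in row]))
    hessian = torch.stack(rows)
    eigvals = torch.linalg.eigvalsh(hessian)
    return (eigvals > 0).float().mean().item()

x = torch.randn(64, 10)
y = torch.randint(0, 3, (64,))
model = make_mlp()

# Probe 1: weight scaling -- multiply all parameters by alpha, T fixed at 1.
for alpha in (0.5, 1.0, 2.0):
    scaled = make_mlp()
    with torch.no_grad():
        for p_src, p_dst in zip(model.parameters(), scaled.parameters()):
            p_dst.copy_(alpha * p_src)
    frac = positive_curvature_fraction(scaled, x, y)
    print(f"alpha={alpha}: positive curvature fraction = {frac:.3f}")

# Probe 2: softmax temperature scaling -- weights fixed, vary T.
for T in (0.5, 1.0, 2.0):
    frac = positive_curvature_fraction(model, x, y, temperature=T)
    print(f"T={T}: positive curvature fraction = {frac:.3f}")
```

The two loops probe the same initialization along different one-parameter families; the abstract's claim is that the temperature family behaves more consistently once inhomogeneities such as layer normalization break the scale symmetry that weight scaling relies on.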
Student Paper: Yes
Submission Number: 61