Hyperbolic Aware Minimization: Implicit Bias for Sparsity

ICLR 2026 Conference Submission 4837 Authors

Published: 26 Jan 2026, Last Modified: 26 Jan 2026 · ICLR 2026 · CC BY 4.0
Keywords: Sparsity, Implicit bias, Sign flip, Exponential update, Training dynamics, Bregman function
TL;DR: We propose Hyperbolic Aware Minimization (HAM), a method that improves sparse training by preserving the benefits of the hyperbolic implicit bias while avoiding the slowdown that the vanishing inverse metric causes in parameter updates.
Abstract: Understanding the implicit bias of optimization algorithms is key to explaining and improving the generalization of deep models. The hyperbolic implicit bias induced by pointwise overparameterization promotes sparsity, but it also yields a small inverse Riemannian metric near zero, which slows parameter movement and impedes meaningful parameter sign flips. To overcome this obstacle, we propose Hyperbolic Aware Minimization (HAM), which alternates a standard optimizer step with a lightweight hyperbolic mirror step. The mirror step requires less compute and memory than pointwise overparameterization, reproduces its beneficial hyperbolic geometry for feature learning, and mitigates the bottleneck caused by the small inverse metric. Our characterization of the implicit bias for underdetermined linear regression provides insight into the mechanism by which HAM consistently improves performance, even in the case of dense training, as we demonstrate in experiments on standard vision benchmarks. HAM is especially effective in combination with different sparsification methods, advancing the state of the art.
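To make the alternating structure described in the abstract concrete, below is a minimal PyTorch sketch. It assumes the "hyperbolic mirror step" can be realized as a hypentropy-style (asinh/sinh) mirror-descent update that reuses the current gradient; the function name `hyperbolic_mirror_step` and the hyperparameters `lr_mirror` and `beta` are illustrative and not the authors' implementation.

```python
import torch

def hyperbolic_mirror_step(params, lr_mirror=0.01, beta=1e-3):
    """Sketch of a lightweight hyperbolic mirror step (assumed hypentropy-style).

    Maps each parameter to the dual space via asinh, takes a gradient step
    there, and maps back via sinh. Small `beta` strengthens the pull of small
    weights toward zero, mimicking the sparsity-promoting hyperbolic bias.
    """
    with torch.no_grad():
        for p in params:
            if p.grad is None:
                continue
            dual = torch.asinh(p / beta)      # mirror map to dual coordinates
            dual -= lr_mirror * p.grad        # gradient step in dual space
            p.copy_(beta * torch.sinh(dual))  # map back to primal coordinates

# Usage sketch: alternate a standard optimizer step with the mirror step.
model = torch.nn.Linear(128, 10)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(32, 128), torch.randint(0, 10, (32,))

for _ in range(100):
    opt.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(x), y)
    loss.backward()
    opt.step()                                  # standard optimizer step
    hyperbolic_mirror_step(model.parameters())  # lightweight hyperbolic step
```

Note that this sketch simply reuses the gradient from the standard step for the mirror update; the paper's exact update rule, gradient handling, and scheduling may differ.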
Supplementary Material: zip
Primary Area: optimization
Submission Number: 4837