Loss Transformation Invariance of the Damped Newton Methods

Alexander Shestakov; Sushil Bohara; Samuel Horváth; Martin Takáč; Slavomir Hanzely

04 Sept 2025 (modified: 11 Feb 2026). Submitted to ICLR 2026. License: CC BY 4.0

Keywords: optimization, second-order methods, damped Newton method, invariance

TL;DR: We introduce the concept of loss transformation invariance and prove that it holds for the stepsized Newton method. This allows improving loss properties, enables transformation-induced stepsizes, and justifies unconventional Newton stepsizes.

Abstract: The Newton method is one of the most widely used second-order optimization techniques, valued for its conceptual simplicity and extremely fast local convergence. A key advantage is its invariance under affine transformations (e.g., choice of coordinate basis), which greatly facilitates implementation. However, the classical Newton method fails to converge when initialized far from the solution, motivating the development of various globalization techniques. In this work, we focus on stepsize damping, which, when appropriately scheduled, ensures fast global convergence while preserving both affine invariance and superlinear local rates. Although highly effective in convex settings, existing algorithms offer limited guarantees for problems that are only nearly convex. To address this, we investigate loss transformations that convexify the objective. We show that Newton stepsize schedules are invariant under such transformations and that stepsize scheduling implicitly searches over the space of objective transformations. Our theoretical findings are further supported by comprehensive experimental validation.
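The link between loss transformations and stepsizes can be checked numerically on a small example. The sketch below is an illustration only and is not taken from the paper: the test objective f, the transformation exp(f), and all function names are assumptions chosen for this example. It relies on the Sherman-Morrison identity, which gives that a full Newton step on exp(f) equals a damped Newton step on f with stepsize 1 / (1 + g^T H^{-1} g), where g and H are the gradient and Hessian of f.

```python
import math

def solve2(H, g):
    """Solve the 2x2 linear system H d = g by Cramer's rule."""
    (h00, h01), (h10, h11) = H
    det = h00 * h11 - h01 * h10
    return [(h11 * g[0] - h01 * g[1]) / det,
            (h00 * g[1] - h10 * g[0]) / det]

# Hypothetical strictly convex test objective (an assumption for illustration):
# f(x) = x0^2 + x0*x1 + 2*x1^2 + exp(x0)
def f(x):
    return x[0] ** 2 + x[0] * x[1] + 2 * x[1] ** 2 + math.exp(x[0])

def grad_f(x):
    return [2 * x[0] + x[1] + math.exp(x[0]), x[0] + 4 * x[1]]

def hess_f(x):
    return [[2 + math.exp(x[0]), 1.0], [1.0, 4.0]]

def newton_direction_exp_f(x):
    """Full (stepsize 1) Newton direction for the transformed loss exp(f)."""
    g, H = grad_f(x), hess_f(x)
    e = math.exp(f(x))
    # Chain rule: grad exp(f) = e * g, hess exp(f) = e * (H + g g^T).
    g_t = [e * gi for gi in g]
    H_t = [[e * (H[i][j] + g[i] * g[j]) for j in range(2)] for i in range(2)]
    return solve2(H_t, g_t)

def damped_newton_direction_f(x):
    """Newton direction for f, scaled by the transformation-induced stepsize."""
    g, H = grad_f(x), hess_f(x)
    d = solve2(H, g)
    decrement_sq = g[0] * d[0] + g[1] * d[1]  # g^T H^{-1} g
    alpha = 1.0 / (1.0 + decrement_sq)
    return [alpha * di for di in d], alpha

x = [0.7, -1.3]
d_transformed = newton_direction_exp_f(x)
d_damped, alpha = damped_newton_direction_f(x)
print(d_transformed, d_damped, alpha)
```

In this example both directions agree up to floating-point error, so choosing the stepsize alpha on f plays the same role as choosing the transformation applied to f. Other monotone transformations would induce other stepsizes; the specific transformations and schedules studied in the paper are not stated in this abstract.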

Supplementary Material: zip

Primary Area: optimization

Submission Number: 2062
