A Multilevel Low-Rank Newton Method with Super-linear Convergence Rate and its Application to Non-convex Problems
Abstract: Second-order methods can address the shortcomings of first-order methods for the optimization of large-scale machine learning models.
However, second-order methods incur significantly higher computational costs because they must compute second-order information. Randomized subspace methods mitigate some of these costs by computing search directions in lower-dimensional spaces. Even though super-linear convergence rates have been observed empirically, it has not been possible to show rigorously that these variants of second-order methods can indeed achieve such fast rates.
Moreover, it is unclear whether subspace methods are efficient in non-convex settings.
To address these shortcomings, we develop a link between multigrid optimization methods and low-rank Newton methods that enables us to rigorously prove the super-linear rates of stochastic low-rank Newton methods. Our method does not require any computations in the original model dimension. We further propose a truncated version of the method that is capable of solving high-dimensional non-convex problems. Preliminary numerical experiments show that our method escapes saddle points more reliably than state-of-the-art first-order methods and thus attains lower training errors.
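To make the subspace idea concrete, the following is a minimal NumPy sketch of a generic randomized subspace (sketched) Newton step, in which the Newton system is restricted to an r-dimensional random subspace so that only an r x r linear system is solved. It illustrates the class of methods discussed above, not the multilevel method proposed in this submission; the sketch size r, the Gaussian sketching matrix, the damping constant, and the test objective are illustrative assumptions, and the Hessian-times-sketch product here is still formed in the original dimension.

```python
import numpy as np

def subspace_newton_step(grad, hess, x, r=20, lr=1.0):
    """One generic randomized subspace (low-rank) Newton step (illustrative).

    grad(x) returns the gradient in R^n and hess(x) the Hessian in R^{n x n};
    a Hessian-vector product could be substituted to avoid forming the matrix.
    A random sketch S in R^{n x r} restricts the Newton system to an
    r-dimensional subspace, so only an r x r system is solved.
    """
    n = x.shape[0]
    S = np.random.randn(n, r) / np.sqrt(r)        # Gaussian sketching matrix
    g = grad(x)
    HS = hess(x) @ S                              # n x r: Hessian applied to the sketch
    reduced_H = S.T @ HS + 1e-8 * np.eye(r)       # r x r reduced Hessian (damped)
    reduced_g = S.T @ g                           # r-dimensional reduced gradient
    d = np.linalg.solve(reduced_H, reduced_g)     # solve the small Newton system
    return x - lr * (S @ d)                       # lift the direction back to R^n

# Usage on a strongly convex quadratic f(x) = 0.5 x^T A x - b^T x (hypothetical test case)
n = 1000
A = np.random.randn(n, n)
A = A @ A.T / n + np.eye(n)
b = np.random.randn(n)
x = np.zeros(n)
for _ in range(50):
    x = subspace_newton_step(lambda z: A @ z - b, lambda z: A, x, r=50)
```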
Submission Type: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Stephen_Becker1
Submission Number: 6223