Globally Q-linear Gauss-Newton Method for Overparameterized Non-convex Matrix Sensing

Xixi Jia; Fangchen FENG; Deyu Meng; Defeng Sun

Globally Q-linear Gauss-Newton Method for Overparameterized Non-convex Matrix Sensing

Xixi Jia, Fangchen FENG, Deyu Meng, Defeng Sun

Published: 25 Sept 2024, Last Modified: 13 Jan 2025NeurIPS 2024 posterEveryoneRevisionsBibTeXCC BY-NC 4.0

Keywords: low-rank matrix sensing, non-convex optimization, Gaussian-Newton method

Abstract: This paper focuses on the optimization of overparameterized, non-convex low-rank matrix sensing (LRMS)—an essential component in contemporary statistics and machine learning. Recent years have witnessed significant breakthroughs in first-order methods, such as gradient descent, for tackling this non-convex optimization problem. However, the presence of numerous saddle points often prolongs the time required for gradient descent to overcome these obstacles. Moreover, overparameterization can markedly decelerate gradient descent methods, transitioning its convergence rate from linear to sub-linear. In this paper, we introduce an approximated Gauss-Newton (AGN) method for tackling the non-convex LRMS problem. Notably, AGN incurs a computational cost comparable to gradient descent per iteration but converges much faster without being slowed down by saddle points. We prove that, despite the non-convexity of the objective function, AGN achieves Q-linear convergence from random initialization to the global optimal solution. The global Q-linear convergence of AGN represents a substantial enhancement over the convergence of the existing methods for the overparameterized non-convex LRMS. The code for this paper is available at \url{https://github.com/hsijiaxidian/AGN}.

Primary Area: Optimization for deep networks

Submission Number: 4376

Loading