Efficient Over-parameterized Matrix Sensing via Alternating Preconditioned Gradient Descent

27 Sept 2024 (modified: 05 Feb 2025) · Submitted to ICLR 2025 · CC BY 4.0
Keywords: matrix sensing, over-parameterization, low rank matrix recovery, gradient descent
TL;DR: We propose an alternating preconditioned gradient descent (APGD) method to solve the low-rank matrix sensing problem in the over-parameterized and ill-conditioned setting.
Abstract: We consider the low-rank matrix sensing problem in the over-parameterized setting, where the specified rank is larger than the true rank. Precisely, our main objective is to recover a matrix $X^*\in\mathbb{R}^{n_1\times n_2}$ of rank $r_\star$ using an over-parameterized factorization $LR^{\top}$, where $L\in\mathbb{R}^{n_1\times r}$, $R\in\mathbb{R}^{n_2\times r}$, and $\min\{n_1,n_2\}\ge r> r_\star$ with the true rank $r_\star$ unknown. Commonly used methods for this problem, such as Factorized Gradient Descent (FGD), exhibit only sub-linear convergence, and their performance can deteriorate significantly when the matrix condition number is large. To address this issue, we propose the alternating preconditioned gradient descent (APGD) method, in which an inexpensive right preconditioner with a constant damping parameter is applied to the original gradient. We prove that, even from a random initialization, APGD recovers the target matrix at a linear convergence rate in the over-parameterized setting, independent of the condition number. Notably, unlike previous FGD-based methods, APGD alternates between updating the two factor matrices, which eliminates the reliance on a small step size and thereby enables faster convergence. Through a series of experiments, we demonstrate that APGD achieves the fastest convergence speed compared with other methods and is robust with respect to the step size, the condition number, and other parameters.
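
To make the abstract's description concrete, the following is a minimal sketch of an alternating, right-preconditioned update for over-parameterized matrix sensing, reconstructed from the abstract alone. The Gaussian sensing operator, the specific update rule $(L \leftarrow L - \eta\, \nabla_L f\, (R^\top R + \lambda I)^{-1}$ and its counterpart for $R$), the step size, and the damping parameter are assumptions for illustration; the paper's exact algorithm and parameter choices may differ.

```python
# Hypothetical sketch of an APGD-style iteration (not the authors' reference code).
import numpy as np

def apgd_sketch(A, y, n1, n2, r, eta=0.5, lam=1e-3, iters=200, seed=0):
    """Recover X* ~ L R^T from linear measurements y = A(X*).

    A   : (m, n1*n2) matrix representing the sensing operator (assumed Gaussian)
    y   : (m,) measurement vector
    r   : over-specified rank (r >= true rank r_star)
    lam : constant damping parameter in the right preconditioner (assumed)
    """
    rng = np.random.default_rng(seed)
    L = rng.standard_normal((n1, r)) / np.sqrt(n1)   # random initialization
    R = rng.standard_normal((n2, r)) / np.sqrt(n2)

    def residual_grad(L, R):
        # Gradient of 0.5 * ||A(L R^T) - y||^2 with respect to the full matrix L R^T.
        res = A @ (L @ R.T).ravel() - y
        return (A.T @ res).reshape(n1, n2)

    for _ in range(iters):
        # Update L with the right preconditioner (R^T R + lam I)^{-1}.
        G = residual_grad(L, R)
        L = L - eta * (G @ R) @ np.linalg.inv(R.T @ R + lam * np.eye(r))
        # Then update R using the freshly updated L (alternating step).
        G = residual_grad(L, R)
        R = R - eta * (G.T @ L) @ np.linalg.inv(L.T @ L + lam * np.eye(r))
    return L, R

# Toy usage: rank-2 ground truth recovered with over-specified rank r = 5.
if __name__ == "__main__":
    n1, n2, r_star, r, m = 30, 20, 2, 5, 1200
    rng = np.random.default_rng(1)
    X_star = rng.standard_normal((n1, r_star)) @ rng.standard_normal((r_star, n2))
    A = rng.standard_normal((m, n1 * n2)) / np.sqrt(m)
    y = A @ X_star.ravel()
    L, R = apgd_sketch(A, y, n1, n2, r)
    print("relative error:", np.linalg.norm(L @ R.T - X_star) / np.linalg.norm(X_star))
```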
Supplementary Material: zip
Primary Area: optimization
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 9751