Keywords: learning-based optimization, learning-based iterative method, differentiable matrix inverse, differentiable singular value decomposition, convergence, generalization
Abstract: Matrix inverse (Minv) and singular value decomposition (SVD) are among the most widely used matrix operations in massive data analysis, machine learning, and statistics. Although well studied, they still encounter difficulties in practical use due to inefficiency and non-differentiability. In this paper, we address both the efficiency and the differentiability issues through learning-based methods. First, for matrix inverse, we propose a differentiable yet efficient method, named LD-Minv, a learnable deep neural network (DNN) in which each layer is an $L$-th order matrix polynomial. We show that, with proper initialization, the difference between LD-Minv's output and the exact pseudo-inverse is of order $O(\exp(-L^K))$, where $K$ is the depth of LD-Minv. Moreover, by learning from data, LD-Minv further reduces the difference between its output and the exact pseudo-inverse. We prove that gradient descent finds an $\epsilon$-error minimum within $O(nKL\log(1/\epsilon))$ steps for LD-Minv, where $n$ is the data size. Lastly, we provide generalization bounds for LD-Minv in both the under-parameterized and the over-parameterized settings. As an application of LD-Minv, we propose a learning-based optimization method for problems with orthogonality constraints and use it to differentiate SVD (D-SVD). We also offer a theoretical generalization guarantee for D-SVD. Finally, we demonstrate the superiority of our methods on synthetic and real data in the supplementary materials.
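The fixed-coefficient counterpart of a layer built from an $L$-th order matrix polynomial is the classical higher-order Newton-Schulz (hyperpower) iteration for the pseudo-inverse, whose residual decays like $O(\exp(-L^K))$ after $K$ steps. The sketch below shows this non-learned baseline; LD-Minv, as described in the abstract, would replace the fixed polynomial coefficients with learnable ones, and the function name and defaults here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def hyperpower_pinv(A, L=3, K=8):
    """Order-L Newton-Schulz (hyperpower) iteration for the Moore-Penrose
    pseudo-inverse. Each step multiplies by a degree-(L-1) polynomial in the
    residual R = I - A X, giving residual decay on the order of exp(-L^K).
    LD-Minv would learn the polynomial coefficients; here they are fixed.
    """
    # Scaled initialization X0 = A^T / (||A||_1 ||A||_inf) guarantees the
    # initial residual has spectral radius < 1, so the iteration converges.
    X = A.T / (np.linalg.norm(A, 1) * np.linalg.norm(A, np.inf))
    I = np.eye(A.shape[0])
    for _ in range(K):  # K "layers" of the iteration
        R = I - A @ X
        # Horner evaluation of P = I + R + R^2 + ... + R^{L-1}
        P = I.copy()
        for _ in range(L - 1):
            P = I + R @ P
        X = X @ P  # X_{k+1} = X_k P
    return X
```

With `L=3` and `K=8` the effective convergence order is $3^8$, so for well-conditioned inputs the output matches `np.linalg.pinv` to machine precision.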
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
One-sentence Summary: In this paper, we provide a learnable, differentiable, and efficient way to perform matrix inverse and SVD with the theoretical guarantee for optimization and generalization.
Reviewed Version (pdf): https://openreview.net/references/pdf?id=yT1-NYK7qT