Keywords: Deep Neural Network, Schulz Iteration, Matrix Inversion
TL;DR: We propose the Schulz Neural Network (SchulzNN), based on the Schulz iteration, for computing matrix inverses.
Abstract: While neural networks have emerged as powerful tools for solving optimization problems, with performance comparable to or surpassing that of traditional solvers, their application to fundamental numerical optimization problems remains underexplored. In particular, little progress has been made on neural network-based approaches to matrix inversion, a cornerstone problem in unconstrained numerical optimization. This work presents SchulzNN, the first neural network-based solver for matrix inversion inspired by the Schulz iterative method. Our architecture simulates the traditional iteration process through parametric learning, preserving theoretical convergence guarantees while enhancing computational flexibility. We rigorously evaluate both single-layer SchulzNN and deep variants on (1) diverse matrix families of practical significance and (2) matrices that fall outside the convergence conditions of the classical Schulz iteration. We also establish a systematic framework for analyzing model adaptation through fine-tuning strategies. Numerical examples demonstrate the accuracy and efficiency of the proposed SchulzNN for matrix inversion.
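For readers unfamiliar with the method the abstract builds on, the classical Schulz iteration refines an approximate inverse via X_{k+1} = X_k (2I - A X_k) and converges when the initial residual I - A X_0 has norm below 1. The sketch below illustrates only this classical iteration, not the SchulzNN architecture, which the abstract does not specify. The function names, the fixed iteration count, and the initial guess X_0 = A^T / (||A||_1 ||A||_inf) are assumptions chosen for illustration; the code uses plain Python lists to stay dependency-free.

```python
def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def schulz_inverse(a, num_iters=30):
    """Approximate inv(a) with the classical Schulz iteration X <- X (2I - A X).

    Hypothetical helper for illustration; num_iters is an assumed default.
    """
    n = len(a)
    # Assumed initial guess: X0 = A^T / (||A||_1 * ||A||_inf), a common choice
    # that satisfies the convergence condition for nonsingular A.
    norm_1 = max(sum(abs(a[i][j]) for i in range(n)) for j in range(n))
    norm_inf = max(sum(abs(v) for v in row) for row in a)
    scale = norm_1 * norm_inf
    x = [[a[j][i] / scale for j in range(n)] for i in range(n)]
    for _ in range(num_iters):
        ax = matmul(a, x)
        # Form 2I - A X, then apply the update X <- X (2I - A X).
        r = [[(2.0 if i == j else 0.0) - ax[i][j] for j in range(n)] for i in range(n)]
        x = matmul(x, r)
    return x


if __name__ == "__main__":
    a = [[4.0, 1.0], [2.0, 3.0]]
    x = schulz_inverse(a)
    # The product A X should be close to the identity matrix.
    print(matmul(a, x))
```

Each step costs two matrix multiplications and the residual norm is squared per step, so a learned variant would presumably parameterize or truncate this loop; how SchulzNN does so is left to the paper itself.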
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 22695