Implicit NNs are Almost Equivalent to Not-so-deep Explicit NNs for High-dimensional Gaussian Mixtures

15 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: random matrix theory, implicit neural networks, deep equilibrium models, high-dimensional statistics
Abstract: Implicit neural networks (NNs) have demonstrated remarkable success in various tasks. However, there is a lack of theoretical understanding of the connections and differences between implicit and explicit networks. In this paper, we employ random matrix theory (RMT) to analyze the eigenspectra of neural tangent kernels (NTKs) and conjugate kernels (CKs) for a broad range of implicit NNs whose input data are drawn from a high-dimensional Gaussian mixture model. Surprisingly, the spectral behavior of implicit CKs and NTKs depends on the activation function and the initial weight variances, but \emph{only} through a system of four nonlinear equations. As a direct (and important!) consequence of our theoretical analysis, we demonstrate that an explicit NN as shallow as two hidden layers, with well-designed activations, can share the same CK or NTK eigenspectrum as \emph{any} given implicit NN. These findings have practical benefits: they allow the design of memory-efficient explicit NNs that match implicit NNs' performance without incurring the computational overhead of fixed-point iterations. The proposed theory is supported by empirical results on both synthetic and real-world datasets.
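To make the objects compared in the abstract concrete, here is a minimal Python sketch (a hypothetical illustration, not the paper's code) that computes the empirical CK Gram matrices of a simple implicit layer, solved by fixed-point iteration, and of a two-hidden-layer explicit NN on high-dimensional Gaussian mixture inputs, then compares their eigenspectra. The width, weight variances, and tanh activation are illustrative assumptions; this sketch does not implement the paper's activation-matching construction, it only shows how the two spectra would be measured empirically.

```python
# Illustrative sketch (not the authors' code): empirical CK eigenspectra of
# an implicit layer vs. a two-hidden-layer explicit NN on Gaussian mixture data.
import numpy as np

rng = np.random.default_rng(0)
d, n = 512, 1024                       # input dimension, number of samples

# Two-class Gaussian mixture: means +/- mu, identity covariance.
mu = rng.standard_normal(d) / np.sqrt(d)
labels = rng.integers(0, 2, size=n)
X = rng.standard_normal((d, n)) + np.outer(mu, 2 * labels - 1)

phi = np.tanh                          # activation (hypothetical choice)

def implicit_ck(X, width=512, tol=1e-6, max_iter=500):
    """Empirical CK Gram matrix of an implicit layer z = phi(A z + B x)."""
    # A is scaled so the fixed-point map stays contractive.
    A = rng.standard_normal((width, width)) * (0.4 / np.sqrt(width))
    B = rng.standard_normal((width, d)) / np.sqrt(d)
    Z = np.zeros((width, X.shape[1]))
    for _ in range(max_iter):          # fixed-point (equilibrium) iteration
        Z_new = phi(A @ Z + B @ X)
        if np.linalg.norm(Z_new - Z) < tol:
            break
        Z = Z_new
    return Z.T @ Z / width

def explicit_ck(X, width=512):
    """Empirical CK Gram matrix of a two-hidden-layer explicit NN."""
    W1 = rng.standard_normal((width, d)) / np.sqrt(d)
    W2 = rng.standard_normal((width, width)) / np.sqrt(width)
    H = phi(W2 @ phi(W1 @ X))
    return H.T @ H / width

eig_implicit = np.linalg.eigvalsh(implicit_ck(X))
eig_explicit = np.linalg.eigvalsh(explicit_ck(X))
print("implicit CK spectrum range:", eig_implicit.min(), eig_implicit.max())
print("explicit CK spectrum range:", eig_explicit.min(), eig_explicit.max())
```

Per the paper's claim, choosing the explicit network's activations so that the associated system of four nonlinear equations matches that of the implicit NN would make these two eigenspectra asymptotically coincide; the generic tanh choices above are not so tuned.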
Supplementary Material: zip
Primary Area: learning theory
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 275