Weak Correlations as the Underlying Principle for Linearization of Gradient-Based Learning Systems

20 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: metric learning, kernel learning, and sparse coding
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Neural Tangent Kernel, Deep Learning and Representation Learning, Kernels, Statistical Mechanics
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Numerous deep learning models, including wide neural networks, can be conceptualized as nonlinear dynamical physical systems with a large number of interacting degrees of freedom which, in the infinite limit, exhibit simplified dynamics. In this work we analyze gradient-descent-based learning systems whose learning structure is linear in their parameters, analogous to the neural tangent kernel. We establish that this linearity is equivalent to weak correlations between the first and higher derivatives of the hypothesis function with respect to the parameters, evaluated around their initial values, which suggests that these weak correlations are the underlying reason for the observed linearization of such systems. We demonstrate this weak-correlation structure for neural networks in the large-width limit. By leveraging the equivalence between linearity and weak correlations, we derive a bound on the deviation from linearity along the training path of stochastic gradient descent. To facilitate the proof, we introduce a method for bounding the asymptotic behavior of random tensors, and we show that any such tensor possesses a unique tight bound.
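As a point of reference for the abstract, the linearization in question can be read as the standard first-order expansion of the hypothesis function around its initialization; the sketch below assumes the usual neural-tangent-kernel setup with generic symbols $f$, $\theta_0$, $\Theta$ (the paper's precise weak-correlation condition is not reproduced here):

\[
f(x;\theta)\;\approx\;f(x;\theta_0)+\nabla_\theta f(x;\theta_0)^{\top}\,(\theta-\theta_0),
\qquad
\Theta(x,x')\;=\;\nabla_\theta f(x;\theta_0)^{\top}\,\nabla_\theta f(x';\theta_0).
\]

Under gradient descent, the linearized model evolves by kernel dynamics governed by $\Theta$; the abstract's claim is that this regime holds precisely when the first derivative $\nabla_\theta f$ is weakly correlated with the higher derivatives $\nabla_\theta^{k} f$ ($k \ge 2$) around $\theta_0$.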
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: pdf
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2730