The asymptotic spectrum of the Hessian of DNN throughout training

Arthur Jacot; Franck Gabriel; Clement Hongler

Published: 20 Dec 2019, Last Modified: 05 May 2023

Venue: ICLR 2020 Conference Blind Submission

Readers: Everyone

Keywords: theory of deep learning, loss surface, training, fisher information matrix

TL;DR: Description of the limiting spectrum of the Hessian of the loss surface of DNNs in the infinite-width limit.

Abstract: The dynamics of DNNs during gradient descent is described by the so-called Neural Tangent Kernel (NTK). In this article, we show that the NTK allows one to gain precise insight into the Hessian of the cost of DNNs: we obtain a full characterization of the asymptotics of the spectrum of the Hessian, at initialization and during training.
