Faster Directional Convergence of Linear Neural Networks under Spherically Symmetric Data

Dachao Lin; Ruoyu Sun; Zhihua Zhang

Published: 09 Nov 2021, Last Modified: 05 May 2023, NeurIPS 2021 Poster, Readers: Everyone

Keywords: optimization for deep linear networks, global directional convergence

TL;DR: We show global directional convergence guarantees for (deep) linear networks with a spherically symmetric data distribution.

Abstract: In this paper, we study gradient methods for training deep linear neural networks with binary cross-entropy loss. In particular, we show global directional convergence guarantees from a polynomial rate to a linear rate for (deep) linear networks with a spherically symmetric data distribution, which can be viewed as a specific zero-margin dataset. Our results do not require the assumptions in other works such as small initial loss, presumed convergence of weight direction, or overparameterization. We also characterize our findings in experiments.
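The setting in the abstract can be illustrated with a small numerical sketch. This is not the authors' code or experimental setup: the Gaussian inputs, the labels y = sign(<v, x>) for a fixed unit vector v, the two-layer architecture, the step size, and all function names below are assumptions made for illustration only. The sketch runs plain gradient descent on the binary cross-entropy (logistic) loss and records the cosine between the network's effective linear predictor and v, which is one way to observe directional convergence.

```python
import numpy as np


def logistic_loss_grad(w_eff, X, y):
    """Gradient of mean log(1 + exp(-y * <w_eff, x>)) with respect to w_eff."""
    margins = y * (X @ w_eff)
    # sigmoid(-margins), written via tanh to avoid overflow for large |margins|
    sig_neg = 0.5 * (1.0 - np.tanh(margins / 2.0))
    coeff = -y * sig_neg
    return (X * coeff[:, None]).mean(axis=0)


def cosine(a, b):
    """Cosine of the angle between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def train_two_layer_linear(d=5, hidden=8, n=2000, steps=2000, lr=0.1, seed=0):
    """Gradient descent on f(x) = w2^T W1 x; returns the cosine history.

    Assumptions (not taken from the paper): standard Gaussian inputs,
    labels y = sign(<v, x>) with v = e_1, and random Gaussian initialization.
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))      # spherically symmetric inputs
    v = np.zeros(d)
    v[0] = 1.0                           # hypothetical ground-truth direction
    y = np.sign(X @ v)
    y[y == 0] = 1.0                      # guard against an exact zero

    W1 = 0.5 * rng.standard_normal((hidden, d))
    w2 = 0.5 * rng.standard_normal(hidden)

    history = []
    for _ in range(steps):
        w_eff = W1.T @ w2                # effective linear predictor
        history.append(cosine(w_eff, v))
        g = logistic_loss_grad(w_eff, X, y)
        # Chain rule through w_eff = W1^T w2
        grad_W1 = np.outer(w2, g)
        grad_w2 = W1 @ g
        W1 -= lr * grad_W1
        w2 -= lr * grad_w2
    history.append(cosine(W1.T @ w2, v))
    return history


if __name__ == "__main__":
    hist = train_two_layer_linear()
    print("initial cosine:", hist[0])
    print("final cosine:  ", hist[-1])
```

Plotting `1 - cosine` against the iteration count on a log scale, for this two-layer network and for a single-layer baseline, is one hypothetical way to compare directional convergence rates. The sketch itself does not verify any rate claimed in the paper.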

Supplementary Material: pdf

