Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture

Libin Zhu; Chaoyue Liu; Misha Belkin

Published: 31 Oct 2022, Last Modified: 11 Oct 2022, NeurIPS 2022 Accept, Readers: Everyone

Keywords: wide neural networks, directed acyclic graph, transition to linearity, neural tangent kernel, over-parameterization

TL;DR: Feedforward neural networks corresponding to arbitrary directed acyclic graphs undergo transition to linearity as their “width” approaches infinity.

Abstract: In this paper, we show that feedforward neural networks corresponding to arbitrary directed acyclic graphs undergo transition to linearity as their “width” approaches infinity. The width of these general networks is characterized by the minimum in-degree of their neurons, except for the input and first layers. Our results identify the mathematical structure underlying transition to linearity and generalize a number of recent works aimed at characterizing transition to linearity or constancy of the Neural Tangent Kernel for standard architectures.
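To make the notion of width in the abstract concrete, the following is a minimal Python sketch that computes the minimum in-degree over the neurons of a small DAG, skipping the input and first layers. It is an illustration only, not the authors' code. The function name `dag_width`, the dictionary representation of the graph, and the example network are all hypothetical. Two assumptions are made that the abstract does not spell out: a "first layer" neuron is taken to be one whose predecessors are all input nodes, and every remaining neuron (including the output) counts toward the minimum.

```python
def dag_width(in_edges, inputs):
    """Return the minimum in-degree over neurons outside the input and first layers.

    in_edges: dict mapping each node to the list of its predecessor nodes.
    inputs:   iterable of input nodes.

    Assumption (not from the abstract): a "first layer" neuron is one whose
    predecessors are all input nodes. Every other non-input neuron, including
    the output, is counted.
    """
    inputs = set(inputs)
    degrees = []
    for node, preds in in_edges.items():
        if node in inputs:
            continue  # skip the input layer
        if all(p in inputs for p in preds):
            continue  # skip first-layer neurons (fed only by inputs)
        degrees.append(len(preds))
    if not degrees:
        raise ValueError("no neurons beyond the input and first layers")
    return min(degrees)


# Hypothetical example: two inputs, three first-layer neurons,
# one deeper neuron h4, and an output neuron with a skip connection.
example_inputs = {"x1", "x2"}
example_edges = {
    "h1": ["x1", "x2"],
    "h2": ["x1", "x2"],
    "h3": ["x1", "x2"],
    "h4": ["h1", "h2", "h3"],          # in-degree 3
    "out": ["h1", "h2", "h3", "h4"],   # in-degree 4
}
example_width = dag_width(example_edges, example_inputs)
```

In this toy network the counted neurons are `h4` and `out`, so the width under the stated assumptions is the smaller of their in-degrees. The abstract's result concerns the regime where this quantity grows without bound.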

Supplementary Material: pdf
