Abstract: Deep feedforward neural networks are associated with complicated, nonconvex objective functions. Yet, simple optimization algorithms can identify parameters that generalize well to held-out data. We currently lack detailed descriptions of this learning process, even at a qualitative level. We propose a simple tensor decomposition model to study how hidden representations evolve over learning. This approach exactly recovers the true learning dynamics in linear networks, which admit closed-form solutions. On deep, nonlinear architectures performing image classification (CIFAR-10), we find empirically that a low-rank tensor model explains a large fraction of variance while extracting meaningful features, such as stage-like learning and selectivity to inputs.
Keywords: Learning Dynamics, Deep Networks, Tensor Decomposition
TL;DR: We propose a simple unsupervised learning procedure based on tensor decomposition to concisely describe learning dynamics in deep networks.
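The abstract leaves the exact decomposition unspecified. As a minimal sketch, assuming a CP (PARAFAC) decomposition of a checkpoints × neurons × probe-inputs activation tensor via tensorly, the temporal factors of each component trace a learning trajectory while the input factors expose selectivity; the tensor shapes, rank, and random data below are hypothetical stand-ins, not the paper's setup.

```python
# Minimal sketch, assuming a CP (PARAFAC) model of hidden activations.
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

# Hypothetical data: activations of one hidden layer logged at T training
# checkpoints, for N neurons responding to M fixed probe inputs.
T, N, M, R = 50, 128, 256, 3          # R = assumed rank of the model
activations = np.random.randn(T, N, M)  # stand-in for real logged activations

# Rank-R CP decomposition: each component yields a temporal factor
# (learning trajectory), a neuron factor, and an input-selectivity factor.
weights, factors = parafac(tl.tensor(activations), rank=R)
time_factor, neuron_factor, input_factor = factors

# Fraction of variance explained by the low-rank reconstruction.
recon = tl.cp_to_tensor((weights, factors))
r2 = 1 - np.sum((activations - recon) ** 2) / np.sum(activations ** 2)
print(f"rank-{R} CP model explains {r2:.1%} of variance")
```

Plotting each column of `time_factor` against checkpoint index would surface stage-like transitions, and sorting probe inputs by `input_factor` would reveal what each component is selective to, in the spirit of the features the abstract describes.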