Learning Dynamics of Deep Networks Admit Low-Rank Tensor Descriptions

Christopher H. Stock, Alex H. Williams, Madhu S. Advani, Andrew M. Saxe, Surya Ganguli

Feb 12, 2018 (modified: Jun 04, 2018) ICLR 2018 Workshop Submission
  • Abstract: Deep feedforward neural networks are associated with complicated, nonconvex objective functions. Yet simple optimization algorithms can identify parameters that generalize well to held-out data. We currently lack detailed descriptions of this learning process, even on a qualitative level. We propose a simple tensor decomposition model to study how hidden representations evolve over the course of learning. This approach precisely extracts the correct dynamics of learning in linear networks, which admit closed-form solutions. On deep, nonlinear architectures performing image classification (CIFAR-10), we find empirically that a low-rank tensor model can explain a large fraction of variance while extracting meaningful features, such as stage-like learning and selectivity to inputs.
  • Keywords: Learning Dynamics, Deep Networks, Tensor Decomposition
  • TL;DR: We propose a simple unsupervised learning procedure based on tensor decomposition to concisely describe learning dynamics in deep networks.
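
To make the idea of a low-rank tensor description concrete, here is a minimal sketch rather than the authors' implementation. It assumes that hidden activations are stacked into a neurons × inputs × epochs tensor and fit with a CP (PARAFAC) decomposition by alternating least squares; the abstract names neither the tensor layout nor the specific decomposition, so both are assumptions. The synthetic data, the function names (`khatri_rao`, `unfold`, `cp_als`), and the uncentered definition of variance explained are likewise hypothetical choices made for illustration.

```python
import numpy as np


def khatri_rao(a, b):
    """Column-wise Kronecker product: (I x R), (J x R) -> (I*J x R)."""
    rank = a.shape[1]
    return (a[:, None, :] * b[None, :, :]).reshape(-1, rank)


def unfold(x, mode):
    """Matricize a tensor: put axis `mode` first, flatten the rest."""
    return np.moveaxis(x, mode, 0).reshape(x.shape[mode], -1)


def cp_als(x, rank, n_iter=300, seed=0):
    """Fit a rank-`rank` CP model to a 3-way tensor by alternating least squares."""
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((dim, rank)) for dim in x.shape]
    for _ in range(n_iter):
        for mode in range(3):
            # Hold the other two factor matrices fixed, solve for this one.
            first, second = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(first, second)
            gram = (first.T @ first) * (second.T @ second)
            factors[mode] = unfold(x, mode) @ kr @ np.linalg.pinv(gram)
    return factors


# Synthetic stand-in for recorded activations (assumption: not real network data).
rng = np.random.default_rng(1)
n_neurons, n_inputs, n_epochs = 20, 15, 30
epochs = np.arange(n_epochs)

neuron_true = rng.standard_normal((n_neurons, 2))
input_true = rng.standard_normal((n_inputs, 2))
# Two sigmoidal time courses that switch on at different epochs ("stages").
time_true = np.stack(
    [1.0 / (1.0 + np.exp(-(epochs - t0) / 2.0)) for t0 in (8, 20)], axis=1
)

tensor = np.einsum("ir,jr,kr->ijk", neuron_true, input_true, time_true)
tensor += 0.01 * rng.standard_normal(tensor.shape)  # small observation noise

neuron_f, input_f, time_f = cp_als(tensor, rank=2)
reconstruction = np.einsum("ir,jr,kr->ijk", neuron_f, input_f, time_f)

# Uncentered fraction of variance explained by the low-rank model.
explained = 1.0 - np.sum((tensor - reconstruction) ** 2) / np.sum(tensor ** 2)
print(f"fraction of variance explained: {explained:.4f}")
```

Under these assumptions, each component pairs a pattern over neurons with a pattern over inputs and a time course over epochs, so plotting the columns of `time_f` is one way to look for stage-like learning, and the columns of `input_f` for input selectivity. CP factors are only identified up to scaling and permutation of components, so the recovered columns need not match the generating ones in order or magnitude.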