Abstract: Many theories of deep learning have shown that a deep network can require dramatically
fewer resources than a shallow network to represent a given function. But a question
remains: can these efficient representations be learned
using current deep learning techniques? In this work, we test whether standard
deep learning methods can in fact find the efficient representations posited by several
theories of deep representation. Specifically, we train deep neural networks
to learn two simple functions with known efficient solutions: the parity function
and the fast Fourier transform. We find that, under gradient-based optimization, a
deep network does not learn the parity function unless it is initialized very close to a
hand-coded exact solution. We also find that a deep linear neural network does not
learn the fast Fourier transform, even in the best-case scenario of infinite training
data, unless the weights are initialized very close to the exact hand-coded solution.
Our results suggest that not every compositional function
can be learned efficiently by a deep network, and that further restrictions are needed
to characterize which functions are both efficiently representable and learnable.
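
To make the parity claim concrete, the sketch below hand-codes a depth-log2(n) ReLU network that computes n-bit parity by XOR-ing adjacent bits layer by layer. This is a minimal illustration of the kind of efficient solution referred to in the abstract, not the paper's exact construction; the helper names (`xor_layer`, `deep_parity`) and the specific identity XOR(a, b) = relu(a + b) - 2 relu(a + b - 1) for binary inputs are illustrative assumptions.

```python
import numpy as np

def xor_layer(bits):
    """One layer of the parity tree: XOR adjacent pairs of binary values.
    For a, b in {0, 1}: XOR(a, b) = relu(a + b) - 2 * relu(a + b - 1)."""
    relu = lambda z: np.maximum(z, 0.0)
    s = bits[0::2] + bits[1::2]            # sums of adjacent input pairs
    return relu(s) - 2.0 * relu(s - 1.0)

def deep_parity(bits):
    """Hand-coded ReLU network of depth log2(n) computing n-bit parity
    (n is assumed to be a power of two)."""
    x = np.asarray(bits, dtype=float)
    while x.size > 1:
        x = xor_layer(x)
    return int(round(x[0]))

# Sanity check against the direct definition of parity.
rng = np.random.default_rng(0)
for _ in range(100):
    b = rng.integers(0, 2, size=8)
    assert deep_parity(b) == b.sum() % 2
```

The whole network uses only O(n) ReLU units at depth O(log n); the experiments ask whether gradient-based training can reach a solution of this form rather than whether one exists.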
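Similarly, the fast Fourier transform can be written as a product of O(log n) sparse "butterfly" matrices, i.e. as a deep linear network; this is the general shape of the hand-coded solution the abstract refers to. The sketch below, using the assumed helper name `dft_factors`, builds the radix-2 Cooley-Tukey factorization and checks that the product of the sparse factors equals the dense DFT matrix.

```python
import numpy as np

def dft_factors(n):
    """Sparse 'butterfly' factors of the n x n DFT matrix (radix-2 Cooley-Tukey,
    n a power of two). Each factor is one layer of a deep linear network."""
    if n == 1:
        return [np.ones((1, 1), dtype=complex)]
    m = n // 2
    # Permutation routing even-indexed inputs to the top half, odd-indexed to the bottom.
    perm = np.zeros((n, n), dtype=complex)
    perm[np.arange(m), np.arange(0, n, 2)] = 1.0
    perm[np.arange(m, n), np.arange(1, n, 2)] = 1.0
    # Butterfly layer combining the two half-size transforms with twiddle factors.
    D = np.diag(np.exp(-2j * np.pi / n) ** np.arange(m))
    I = np.eye(m)
    butterfly = np.block([[I, D], [I, -D]])
    # Two half-size DFTs applied in parallel (block-diagonal), factored recursively.
    lifted = [np.kron(np.eye(2), f) for f in dft_factors(m)]
    return [butterfly] + lifted + [perm]

n = 8
layers = dft_factors(n)
product = layers[0]
for layer in layers[1:]:
    product = product @ layer
assert np.allclose(product, np.fft.fft(np.eye(n), axis=0))  # matches the dense DFT matrix
print(f"{len(layers)} sparse layers reproduce the {n}x{n} DFT")
```

A single dense linear layer implementing the same transform needs n^2 nonzero weights, whereas the factored form uses only O(n log n); whether a deep linear network trained by gradient descent recovers such a sparse factorization from generic initializations is what the experiments probe.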
Keywords: deep learning, sparse representations
TL;DR: Examining theory by testing whether deep networks can learn to represent simple functions