- Abstract: Giving provable guarantees for learning neural networks is a core challenge of machine learning theory. Most prior work gives parameter recovery guarantees for one hidden layer networks, however, the networks used in practice have multiple non-linear layers. In this work, we show how we can strengthen such results to deeper networks -- we address the problem of uncovering the lowest layer in a deep neural network under the assumption that the lowest layer uses a high threshold before applying the activation, the upper network can be modeled as a well-behaved polynomial and the input distribution is gaussian.
- Keywords: Deep Learning, Parameter Recovery, Non-convex optimization, high threshold activation
- TL;DR: We provably recover the lowest layer in a deep neural network assuming that the lowest layer uses a "high threshold" activation and the above network is a "well-behaved" polynomial.