Recovering the Lowest Layer of Deep Networks with High Threshold Activations

Surbhi Goel; Rina Panigrahy

27 Sept 2018 (modified: 05 May 2023) | ICLR 2019 Conference Blind Submission | Readers: Everyone

Abstract: Giving provable guarantees for learning neural networks is a core challenge of machine learning theory. Most prior work gives parameter recovery guarantees for one-hidden-layer networks; however, the networks used in practice have multiple non-linear layers. In this work, we show how such results can be strengthened to deeper networks: we address the problem of uncovering the lowest layer in a deep neural network under the assumptions that the lowest layer uses a high threshold before applying the activation, that the upper network can be modeled as a well-behaved polynomial, and that the input distribution is Gaussian.
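
To make the setting concrete, the following is a minimal sketch of the kind of network the abstract describes, not the construction from the paper itself. It assumes a step activation for the lowest layer, and the weight matrix, the threshold value, and the polynomial `upper_polynomial` are hypothetical choices made only for illustration.

```python
import random

def sample_gaussian(n, rng):
    """Draw one input x ~ N(0, I_n)."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

def lowest_layer(x, weights, threshold):
    """Step activation with a threshold (assumed form): unit i fires iff w_i . x >= threshold."""
    return [
        1.0 if sum(wi * xi for wi, xi in zip(w, x)) >= threshold else 0.0
        for w in weights
    ]

def upper_polynomial(h):
    """Hypothetical stand-in for the upper network: a low-degree polynomial in the hidden units."""
    linear = sum(h)
    pairwise = sum(h[i] * h[j] for i in range(len(h)) for j in range(i + 1, len(h)))
    return linear + 0.5 * pairwise

def network(x, weights, threshold):
    """Full model: polynomial applied to the thresholded lowest layer."""
    return upper_polynomial(lowest_layer(x, weights, threshold))

rng = random.Random(0)
n = 4
# Illustrative lowest-layer weights: the standard basis (unit-norm rows).
weights = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
threshold = 2.0  # "high" relative to the unit variance of each w_i . x

samples = [sample_gaussian(n, rng) for _ in range(2000)]
outputs = [network(x, weights, threshold) for x in samples]
# Fraction of inputs on which at least one hidden unit fires.
firing_rate = sum(1 for y in outputs if y > 0) / len(outputs)
```

With a high threshold each hidden unit fires rarely under Gaussian input, so most inputs produce an output of zero and inputs that activate two units at once are rarer still. Parameter recovery in this setting means estimating the rows of `weights` from samples of inputs and outputs.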

Keywords: Deep Learning, Parameter Recovery, Non-convex optimization, high threshold activation

TL;DR: We provably recover the lowest layer in a deep neural network, assuming that the lowest layer uses a "high threshold" activation and that the network above it is a "well-behaved" polynomial.
