Demystifying overcomplete nonlinear auto-encoders: fast SGD convergence towards sparse representation from random initialization
Cheng Tang, Claire Monteleoni
Feb 15, 2018 (modified: Feb 15, 2018) · ICLR 2018 Conference Blind Submission
Abstract: Auto-encoders are commonly used for unsupervised representation learning and for pre-training deep neural networks.
When the activation function is linear and the encoding dimension (the width of the hidden layer) is smaller than the input dimension, it is well known that the auto-encoder learns the principal components of the data distribution (Oja, 1982).
However, when the activation is nonlinear and the width is larger than the input dimension (overcomplete), the auto-encoder behaves differently from PCA, and in fact is known to perform well empirically on sparse coding problems.
We provide a theoretical explanation for this empirically observed phenomenon when the rectified linear unit (ReLU) is adopted as the activation function and the hidden-layer width is large.
In this case, we show that, with significant probability, initializing the weight matrix of an auto-encoder by sampling from a spherical Gaussian distribution and then training with stochastic gradient descent (SGD) converges to the ground-truth representation for a class of sparse dictionary learning models.
In addition, we show that, conditioned on convergence, the expected convergence rate is O(1/t), where t is the number of updates.
Our analysis quantifies how increasing the hidden-layer width helps training performance under random initialization, and how the norm of the network weights influences the speed of SGD convergence.
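As a loose illustration of the setting the abstract describes, the following is a minimal NumPy sketch, not the paper's model or algorithm: a weight-tied overcomplete ReLU auto-encoder, initialized with spherical Gaussian weights and trained by SGD on data drawn from a synthetic sparse dictionary model. All sizes (`d`, `h`, `s`), the weight tying, and the learning rate are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not from the paper): input dim d,
# overcomplete hidden width h > d, sparsity s of the ground-truth code.
d, h, s = 20, 100, 3

# Synthetic sparse dictionary model: x = A z with an s-sparse code z.
A = rng.normal(size=(d, h))
A /= np.linalg.norm(A, axis=0)            # unit-norm dictionary atoms

def sample_x():
    z = np.zeros(h)
    support = rng.choice(h, size=s, replace=False)
    z[support] = rng.uniform(0.5, 1.0, size=s)
    return A @ z

def relu(u):
    return np.maximum(u, 0.0)

# Weight-tied auto-encoder (a simplifying assumption for this sketch):
# encode a = ReLU(W x + b), decode x_hat = W^T a.
W = rng.normal(size=(h, d)) / np.sqrt(d)  # spherical Gaussian init
b = np.zeros(h)

def recon_err(n=100):
    """Mean squared reconstruction error on fresh samples."""
    errs = []
    for _ in range(n):
        x = sample_x()
        x_hat = W.T @ relu(W @ x + b)
        errs.append(np.sum((x_hat - x) ** 2))
    return float(np.mean(errs))

init_err = recon_err()

lr = 0.01
for t in range(5000):
    x = sample_x()
    a = relu(W @ x + b)                   # hidden code
    err = W.T @ a - x                     # reconstruction residual
    mask = (a > 0).astype(float)          # ReLU subgradient
    # Gradient of 0.5 * ||W^T a - x||^2 w.r.t. the tied weight matrix:
    # encoder path (through the ReLU) plus decoder path.
    grad_W = np.outer(mask * (W @ err), x) + np.outer(a, err)
    W -= lr * grad_W
    b -= lr * (mask * (W @ err))

final_err = recon_err()
print(init_err, final_err)
```

With the hidden layer wider than the input, SGD from the random initialization steadily reduces reconstruction error on this sparse data, loosely mirroring the convergence behavior the abstract analyzes.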
TL;DR: theoretical analysis of wide nonlinear auto-encoders
Keywords: stochastic gradient descent, autoencoders, nonconvex optimization, representation learning, theory