Open Peer Review. Open Publishing. Open Access. Open Discussion. Open Directory. Open Recommendations. Open API. Open Source.
The loss surface and expressivity of deep convolutional neural networks
Quynh Nguyen, Matthias Hein
Feb 11, 2018 (modified: Feb 15, 2018)ICLR 2018 Workshop Submissionreaders: everyoneShow Bibtex
Abstract:We analyze the expressiveness and loss surface of practical deep convolutional neural networks (CNNs) with shared weights. We show that such CNNs produce linearly independent features (and thus linearly separable) at every ``wide'' layer which has more neurons than the number of training samples. This condition holds e.g. for the VGG network. Furthermore, we provide for such wide CNNs necessary and sufficient conditions for global minima with zero training error. For the case where the wide layer is followed by a fully connected layer we show that almost every critical point of the empirical loss is a global minimum with zero training error. Our analysis suggests that both depth and width are equally important in deep learning. While depth brings more representational power and allows the network to learn high level features, width smoothes the optimization landscape of the loss function in the sense that a sufficiently wide CNN has a well-behaved loss surface with almost no bad local minima.
TL;DR:Deep CNN learn linearly independent features at every wide layer and potentially has almost no bad local minima
Keywords:loss surface, deep CNN, local minima, global minima
Enter your feedback below and we'll get back to you as soon as possible.