The loss surface and expressivity of deep convolutional neural networks
Nov 03, 2017 (modified: Nov 03, 2017) · ICLR 2018 Conference Blind Submission · Readers: Everyone
Abstract: We analyze the expressiveness and loss surface of practical deep convolutional neural networks (CNNs) with shared weights and max pooling layers. We show that such CNNs produce linearly independent features at a "wide" layer, one which has more neurons than the number of training samples; this condition holds, for example, for the VGG network. Furthermore, for such wide CNNs we provide necessary and sufficient conditions for global minima with zero training error. For the case where the wide layer is followed by a fully connected layer, we show that almost every critical point of the empirical loss is a global minimum with zero training error. Our analysis suggests that both depth and width are very important in deep learning. While depth brings more representational power and allows the network to learn high-level features, width smooths the optimization landscape of the loss function, in the sense that a sufficiently wide network has a well-behaved loss surface with almost no bad local minima.
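The abstract's central claim, that a layer with more neurons than training samples generically yields linearly independent features, lends itself to a quick empirical check. Below is a minimal sketch, not taken from the paper: it assumes PyTorch, and the toy CNN, input sizes, and layer widths are illustrative stand-ins rather than the VGG setting the authors analyze. It builds a small conv + max-pooling network whose flattened "wide" layer has more neurons than samples, then verifies that the feature matrix has full row rank.

```python
# Minimal sketch (assumptions: PyTorch; toy architecture, not the paper's):
# check that the feature matrix at a "wide" layer (more neurons than the
# N training samples) has rank N, i.e., linearly independent features.
import torch
import torch.nn as nn

torch.manual_seed(0)

n_samples = 8                          # number of training samples N
x = torch.randn(n_samples, 1, 8, 8)   # toy inputs

# Toy CNN with shared weights and max pooling, ending in a wide layer.
net = nn.Sequential(
    nn.Conv2d(1, 4, kernel_size=3, padding=1),  # shared (convolutional) weights
    nn.ReLU(),
    nn.MaxPool2d(2),                            # max pooling layer
    nn.Flatten(),                               # 4 * 4 * 4 = 64 neurons > N = 8
)

with torch.no_grad():
    feats = net(x)                     # feature matrix of shape (N, 64)

rank = torch.linalg.matrix_rank(feats).item()
print(f"feature matrix shape {tuple(feats.shape)}, rank = {rank}")
# rank == n_samples means the N feature vectors are linearly independent,
# consistent with the abstract's claim for a sufficiently wide layer.
```

With random weights and random inputs the printed rank is generically equal to N, matching the genericity flavor of the paper's result; a rank below N would indicate the features are linearly dependent.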
Keywords: convolutional neural networks, loss surface, expressivity, critical point, global minima, linear separability