Keywords: local minimum, smooth activation, neural networks
TL;DR: We prove the existence of bad local minima in 2-layer and 3-layer neural networks with general smooth activation functions.
Abstract: Understanding the loss surface of neural networks is essential to understanding deep learning. However, the existence of bad local minima has not yet been fully characterized. We investigate the existence of bad local minima in $2$-layer and $3$-layer neural networks with general smooth activation functions. We provide a constructive proof that exploits the algebraic nature of the activation functions, and we establish the result in realistic settings where the data $(X,Y)$ have positive measure. We hope these results provide a theoretical foundation for studies of local minima and loss surfaces.
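For context, a hedged sketch of the standard setup this abstract refers to (the squared loss and the notation $f_\theta$, $L$, $\theta^\ast$ are illustrative assumptions for exposition, not taken from the submission): a $2$-layer network with smooth activation $\sigma$ and parameters $\theta = (a, W, b)$ computes
$$f_\theta(x) = \sum_{j=1}^{k} a_j \,\sigma\big(w_j^\top x + b_j\big), \qquad L(\theta) = \mathbb{E}_{(X,Y)}\big[(f_\theta(X) - Y)^2\big],$$
and a point $\theta^\ast$ is a *bad* local minimum if it is a local minimum of $L$ with $L(\theta^\ast) > \inf_\theta L(\theta)$, i.e., gradient-based training started near $\theta^\ast$ can get stuck at a strictly suboptimal loss.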
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Theory (eg, control theory, learning theory, algorithmic game theory)