Stuck in a What? Adventures in Weight Space
Zachary C. Lipton
Feb 18, 2016 (modified: Feb 18, 2016) · ICLR 2016 workshop submission · Readers: everyone
Abstract: Deep learning researchers commonly suggest
that converged models are stuck in local minima.
More recently, some researchers have observed
that, under reasonable assumptions,
the vast majority of critical points are saddle points, not true minima.
Both descriptions suggest that the weights converge to a point in weight space,
be it a local optimum or merely a critical point.
However, it's possible that neither interpretation is accurate.
Because neural networks are typically over-complete,
it is easy to show that weight space contains vast continuous regions of equal loss.
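As a toy illustration of such a region (not taken from the paper; the network shape and data below are invented for the sketch): in a ReLU network, scaling one hidden unit's incoming weights and bias by any α > 0 while dividing its outgoing weights by α leaves the computed function, and hence the loss, unchanged. Sweeping α therefore traces a continuous equal-loss path through weight space.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 5))           # toy inputs (made up for this sketch)
y = rng.normal(size=(32, 1))           # toy targets

W1 = rng.normal(size=(5, 8)); b1 = rng.normal(size=8)   # hidden layer
W2 = rng.normal(size=(8, 1)); b2 = rng.normal(size=1)   # output layer

def loss(W1, b1, W2, b2):
    # ReLU is positively homogeneous: relu(a * z) = a * relu(z) for a > 0
    h = np.maximum(X @ W1 + b1, 0.0)
    return float(np.mean((h @ W2 + b2 - y) ** 2))

# Rescale hidden unit 0 by alpha: one point on a continuous family of
# weight settings that all compute the same function.
alpha = 2.5
W1s, b1s, W2s = W1.copy(), b1.copy(), W2.copy()
W1s[:, 0] *= alpha
b1s[0] *= alpha
W2s[0, :] /= alpha

print(loss(W1, b1, W2, b2), loss(W1s, b1s, W2s, b2))  # identical up to rounding
```

Since α ranges over all positive reals (and each hidden unit can be rescaled independently), the equal-loss set is a continuous manifold, not an isolated point.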
In this paper, we build on recent work empirically characterizing the error surfaces of neural networks.
We analyze training paths through weight space,
presenting evidence that apparent convergence of loss
does not correspond to weights arriving at critical points,
but instead to large movements through flat regions of weight space.
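A minimal version of this kind of diagnostic (a hypothetical sketch, not the paper's actual experimental setup): train a small network by gradient descent while recording both the loss and the distance traveled in weight space. The abstract's claim is that the loss curve can flatten while the cumulative path length keeps growing.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 4))   # toy data, invented for this sketch
y = rng.normal(size=(64, 1))

def unpack(w):
    # flat vector -> weights of a 4 -> 8 -> 1 ReLU net (49 parameters)
    W1 = w[:32].reshape(4, 8); b1 = w[32:40]
    W2 = w[40:48].reshape(8, 1); b2 = w[48:49]
    return W1, b1, W2, b2

def loss_and_grad(w):
    W1, b1, W2, b2 = unpack(w)
    z = X @ W1 + b1
    h = np.maximum(z, 0.0)
    err = h @ W2 + b2 - y
    L = float(np.mean(err ** 2))
    d_pred = 2 * err / len(X)          # backprop through the MSE
    gW2 = h.T @ d_pred; gb2 = d_pred.sum(0)
    d_z = (d_pred @ W2.T) * (z > 0)    # ReLU gate
    gW1 = X.T @ d_z; gb1 = d_z.sum(0)
    return L, np.concatenate([gW1.ravel(), gb1, gW2.ravel(), gb2])

w = rng.normal(size=49) * 0.5
w0 = w.copy()
losses, path_len = [], 0.0
for step in range(300):
    L, g = loss_and_grad(w)
    losses.append(L)
    update = 0.02 * g
    path_len += np.linalg.norm(update)  # distance moved this step
    w -= update

displacement = np.linalg.norm(w - w0)   # straight-line distance from start
print(losses[0], losses[-1], path_len, displacement)
```

Comparing the total path length with the straight-line displacement is one way to quantify how much the weights wander: when path length greatly exceeds displacement even as the loss has flattened, the trajectory is still moving through a flat region rather than sitting at a critical point.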
While it is trivial to show that neural network error surfaces are globally non-convex,
we show that they are also locally non-convex,
even after breaking symmetry with a random initialization and even after partial training.
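One simple way to probe local non-convexity (again a hypothetical sketch, not the paper's method): convexity on a region requires loss((u + v) / 2) ≤ (loss(u) + loss(v)) / 2 for every pair u, v in the region, so a single violating pair sampled near a given point certifies that the surface is non-convex there. The network, data, and sampling scale below are all invented for the illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(32, 3))   # toy data for a 3 -> 4 -> 1 ReLU net
y = rng.normal(size=(32, 1))

def loss(w):
    W1 = w[:12].reshape(3, 4); b1 = w[12:16]
    W2 = w[16:20].reshape(4, 1); b2 = w[20:21]
    h = np.maximum(X @ W1 + b1, 0.0)
    return float(np.mean((h @ W2 + b2 - y) ** 2))

center = rng.normal(size=21)   # stand-in for a randomly initialized point
violations = 0
for _ in range(500):
    # sample two nearby points and apply the midpoint (chord) test
    u = center + 0.5 * rng.normal(size=21)
    v = center + 0.5 * rng.normal(size=21)
    if loss((u + v) / 2) > (loss(u) + loss(v)) / 2 + 1e-12:
        violations += 1

print(violations)  # any violation certifies non-convexity near `center`
```

Note the asymmetry of the test: violations prove non-convexity, while finding none proves nothing, since convexity must hold for all pairs.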
Conflicts: ucsd.edu, microsoft.com, amazon.com