Universality in halting time
Levent Sagun, Thomas Trogdon, Yann LeCun
Nov 04, 2016 (modified: Jan 18, 2017) | ICLR 2017 conference submission | readers: everyone
Abstract: We present empirical distributions for the halting time (measured by the number of iterations needed to reach a given accuracy) of optimization algorithms applied to two random systems: spin glasses and deep learning. Given an algorithm, which we take to be both the optimization routine and the form of the random landscape, the fluctuations of the halting time follow a distribution that remains unchanged even when the input is changed drastically. We observe two main classes: a Gumbel-like distribution that appears in Google searches, human decision times, QR factorization, and spin glasses, and a Gaussian-like distribution that appears in the conjugate gradient method, deep networks with MNIST input data, and deep networks with random input data. This empirical evidence suggests the presence of a class of distributions for which the halting time is independent of the underlying distribution under some conditions.
TL;DR: Normalized halting time distributions are independent of the input data distribution.
Conflicts: nyu.edu, fb.com, math.uci.edu
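To make the notion of a normalized halting time in the abstract and TL;DR concrete, here is a minimal Python sketch. It is not the authors' experimental setup: the toy landscape (a random diagonal quadratic), the plain gradient descent routine, and the names `halting_time` and `normalized_halting_times` are all assumptions chosen for illustration. It also assumes that "normalized" means centering the halting times to mean zero and scaling them to unit variance.

```python
import math
import random
import statistics


def halting_time(dim, tol, rng, max_iter=10_000):
    """Iterations of gradient descent until the gradient norm falls below tol.

    Hypothetical toy landscape: f(x) = 0.5 * sum(a_i * x_i**2) with random
    curvatures a_i in [0.1, 1.0] and a random Gaussian starting point.
    """
    a = [rng.uniform(0.1, 1.0) for _ in range(dim)]
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for k in range(max_iter):
        grad = [ai * xi for ai, xi in zip(a, x)]
        if math.sqrt(sum(g * g for g in grad)) < tol:
            return k
        # A step size of 1.0 is stable here because every a_i <= 1.
        x = [xi - g for xi, g in zip(x, grad)]
    return max_iter


def normalized_halting_times(samples, dim, tol, seed):
    """Sample halting times, then center and scale them (assumed normalization)."""
    rng = random.Random(seed)
    times = [halting_time(dim, tol, rng) for _ in range(samples)]
    mean = statistics.fmean(times)
    std = statistics.pstdev(times)
    if std == 0:
        raise ValueError("all halting times are identical; cannot normalize")
    return [(t - mean) / std for t in times]


if __name__ == "__main__":
    taus = normalized_halting_times(samples=200, dim=20, tol=1e-6, seed=0)
    print(min(taus), max(taus))
```

Under these assumptions, one could probe the universality claim by swapping the distribution of `a` or of the starting point and comparing histograms of the normalized values; the sketch only produces the samples and does not demonstrate the paper's result.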