LSH-Sampling Breaks the Computational Chicken-and-Egg Loop in Adaptive Stochastic Gradient Estimation
Nov 07, 2017 (modified: Nov 07, 2017) · ICLR 2018 Conference Blind Submission
Abstract: Stochastic Gradient Descent (SGD) is the most popular algorithm for large-scale optimization. In SGD, the gradient is estimated by uniform sampling with sample size one. Several results show that weighted non-uniform sampling yields better gradient estimates and hence faster convergence. Unfortunately, the per-iteration cost of maintaining such an adaptive distribution exceeds the cost of computing the exact gradient itself, creating a chicken-and-egg loop that renders the faster convergence useless. In this paper, we break this chicken-and-egg loop and give the first demonstration of a sampling scheme that yields superior gradient estimates while keeping the per-iteration sampling cost comparable to that of uniform sampling. Such a scheme is possible due to recent advances in the Locality Sensitive Hashing (LSH) literature. As a consequence, we improve the running time of all existing gradient descent algorithms.
TL;DR: We improve the running time of all existing gradient descent algorithms.
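To make the idea concrete, below is a minimal Python sketch of LSH-based adaptive sampling for least-squares SGD using SimHash (random hyperplanes). Everything here is illustrative, not the paper's implementation: the names `LSHSampler` and `lsh_sgd`, the single hash table, the choice K=8, and the bucket-size probability correction are all simplifying assumptions. The key structural point it does capture is that the data are hashed once up front, so each SGD step pays only one query hash rather than a pass over the data.

```python
import numpy as np

class LSHSampler:
    """Sketch: hash data once up front; each sample costs one query hash.

    Single-table SimHash with an approximate per-point sampling probability;
    a hypothetical simplification of the paper's scheme, not its implementation.
    """

    def __init__(self, X, K=8, seed=0):
        rng = np.random.default_rng(seed)
        self.X = X
        self.K = K
        n, d = X.shape
        # K random hyperplanes define a K-bit SimHash signature.
        self.planes = rng.standard_normal((K, d))
        # Preprocess once: bucket every point by its signature.
        codes = X @ self.planes.T > 0
        self.buckets = {}
        for i in range(n):
            self.buckets.setdefault(tuple(codes[i]), []).append(i)

    def sample(self, q, rng):
        """Return (index, estimated sampling probability) for query q."""
        bucket = self.buckets.get(tuple(q @ self.planes.T > 0))
        if not bucket:  # empty bucket: fall back to uniform sampling
            return int(rng.integers(len(self.X))), 1.0 / len(self.X)
        i = bucket[int(rng.integers(len(bucket)))]
        # SimHash collision probability is (1 - theta/pi)^K, where theta is
        # the angle between q and x_i; dividing by the observed bucket size
        # is only an approximation of the true sampling probability.
        cos = q @ self.X[i] / (np.linalg.norm(q) * np.linalg.norm(self.X[i]) + 1e-12)
        theta = np.arccos(np.clip(cos, -1.0, 1.0))
        return i, max((1.0 - theta / np.pi) ** self.K / len(bucket), 1e-12)


def lsh_sgd(X, y, steps=2000, lr=0.05, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Hash the augmented points [x_i, y_i]; querying with [w, -1] makes the
    # inner product equal to the residual w.x_i - y_i, so retrieval is biased
    # toward points with large residuals (plain SimHash favours the positive
    # ones -- another simplification in this sketch).
    sampler = LSHSampler(np.hstack([X, y[:, None]]))
    w = np.zeros(d)
    for _ in range(steps):
        i, p = sampler.sample(np.append(w, -1.0), rng)
        g = (X[i] @ w - y[i]) * X[i]   # per-point least-squares gradient
        w -= lr * g / (n * p)          # importance weight for (approximate) unbiasedness
    return w


if __name__ == "__main__":
    # Toy check: recover a planted linear model.
    rng = np.random.default_rng(1)
    X = rng.standard_normal((5000, 10))
    w_true = rng.standard_normal(10)
    y = X @ w_true + 0.01 * rng.standard_normal(5000)
    print(np.linalg.norm(lsh_sgd(X, y) - w_true))
```

The design point the sketch mirrors is the one the abstract emphasizes: the adaptive distribution is never maintained explicitly. The hash tables are built once, and each iteration's non-uniform sample falls out of a single bucket lookup, so the per-iteration cost stays comparable to uniform sampling.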