LSH Microbatches for Stochastic Gradients: Value in Rearrangement

Eliav Buchnik; Edith Cohen; Avinatan Hassidim; Yossi Matias

27 Sept 2018 (modified: 05 May 2023) | ICLR 2019 Conference Blind Submission | Readers: Everyone

Abstract: Metric embeddings are immensely useful representations of associations between entities (images, users, search queries, words, and more). Embeddings are learned by optimizing a loss objective of the general form of a sum over example associations. Typically, the optimization uses stochastic gradient updates over minibatches of examples that are arranged independently at random. In this work, we propose the use of structured arrangements through randomized microbatches of examples that are more likely to include similar ones. We make a principled argument for the properties of our arrangements that accelerate the training and present efficient algorithms to generate microbatches that respect the marginal distribution of training examples. Finally, we observe experimentally that our structured arrangements accelerate training by 3-20%. Structured arrangements emerge as a powerful and novel performance knob for SGD that is independent of and complementary to other SGD hyperparameters and thus is a candidate for wide deployment.

Keywords: Stochastic Gradient Descent, Metric Embeddings, Locality Sensitive Hashing, Microbatches, Sample coordination

TL;DR: Accelerating SGD by arranging examples differently
