Optimization Can Learn Johnson-Lindenstrauss Embeddings

Published: 25 Sept 2024, Last Modified: 06 Nov 2024 · NeurIPS 2024 poster · CC BY 4.0
Keywords: Optimization, Non-Convex Optimization, Embeddings, Projections, Derandomization, Gradient Descent, Dimensionality Reduction
TL;DR: We give a novel derandomization of JL via optimization that avoids all bad local minima in the non-convex landscape through a diffusion-like process: we move through the space of randomized solution samplers, sequentially reducing the variance.
Abstract: Embeddings play a pivotal role across various disciplines, offering compact representations of complex data structures. Randomized methods like Johnson-Lindenstrauss (JL) provide state-of-the-art and essentially unimprovable theoretical guarantees for achieving such representations. These guarantees are worst-case; in particular, neither the analysis $\textit{nor the algorithm}$ takes into account any potential structural information of the data. The natural question is: must we randomize? Could we instead use an optimization-based approach, working directly with the data? A first answer is no: as we show, the distance-preserving objective of JL has a non-convex landscape over the space of projection matrices, with many bad stationary points. But this is not the final answer. We present a novel method, motivated by diffusion models, that circumvents this fundamental challenge: rather than performing optimization directly over the space of projection matrices, we optimize over the larger space of $\textit{random solution samplers}$, gradually reducing the variance of the sampler. We show that by moving through this larger space, our objective converges to a deterministic (zero-variance) solution, avoiding bad stationary points. This method can also be seen as an optimization-based derandomization approach, and we believe the underlying idea can be applied to many other problems.
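The sampler-based scheme described in the abstract can be sketched in code. The following is a minimal illustrative sketch, not the authors' implementation: it parameterizes a Gaussian sampler $A = M + \sigma G$ over projection matrices, follows a reparameterized gradient estimate of the expected distance-preservation objective with respect to the mean $M$, and anneals $\sigma$ to zero so the final sampler is deterministic. All function names, the linear variance schedule, and the hyperparameters are illustrative assumptions.

```python
# A minimal illustrative sketch (not the authors' implementation) of the idea in the
# abstract: rather than gradient descent directly on a projection matrix, optimize the
# mean M of a Gaussian solution sampler A = M + sigma * G and anneal sigma to zero.
# Function names, the linear variance schedule, and all hyperparameters are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def pair_differences(X):
    """All pairwise differences x_i - x_j (i < j) of the data points."""
    i, j = np.triu_indices(X.shape[0], k=1)
    return X[i] - X[j]

def jl_distortion(A, V):
    """Distance-preservation objective: mean over pairs of (||Av||^2 / ||v||^2 - 1)^2."""
    ratios = np.sum((V @ A.T) ** 2, axis=1) / np.sum(V ** 2, axis=1)
    return np.mean((ratios - 1.0) ** 2)

def distortion_grad(A, V):
    """Gradient of jl_distortion with respect to the projection matrix A."""
    norms2 = np.sum(V ** 2, axis=1)          # ||v||^2 for each pair difference v
    AV = V @ A.T                             # projected differences, shape (pairs, k)
    ratios = np.sum(AV ** 2, axis=1) / norms2
    # d/dA of (r - 1)^2 is 4 (r - 1) / ||v||^2 * (A v) v^T, averaged over pairs.
    coeff = 4.0 * (ratios - 1.0) / norms2
    return (AV * coeff[:, None]).T @ V / V.shape[0]

def optimize_via_sampler(X, k, steps=500, samples=4, lr=0.05,
                         sigma_start=0.2, sigma_end=0.0):
    """Optimize over random solution samplers A = M + sigma * G, G ~ N(0, I),
    shrinking sigma so that the final sampler is a deterministic projection."""
    d = X.shape[1]
    V = pair_differences(X)
    M = rng.standard_normal((k, d)) / np.sqrt(k)   # mean of the sampler
    for t in range(steps):
        sigma = sigma_start + (sigma_end - sigma_start) * t / (steps - 1)
        # Reparameterized estimate of grad_M E_G[ f(M + sigma * G) ].
        grad = np.zeros_like(M)
        for _ in range(samples):
            A = M + sigma * rng.standard_normal(M.shape)
            grad += distortion_grad(A, V)
        M -= lr * grad / samples
    return M                                        # sigma = 0: deterministic solution

# Toy usage: embed 50 points from R^100 into R^20 and report the final distortion.
X = rng.standard_normal((50, 100))
M = optimize_via_sampler(X, k=20)
print("final distortion:", jl_distortion(M, pair_differences(X)))
```

With sigma held at zero throughout, the inner loop collapses to plain gradient descent on the projection matrix itself, i.e., the direct approach whose landscape the abstract argues contains bad stationary points.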
Supplementary Material: zip
Primary Area: Optimization (convex and non-convex, discrete, stochastic, robust)
Submission Number: 18641