Keywords: dimensionality reduction, random projection, random features, metric embedding
TL;DR: We construct random nonlinear maps (compositions of several random feature maps) that compress point sets while preserving information about approximate distances between points.
Abstract: Motivated by the problem of compressing point sets into as few bits as possible while maintaining information about approximate distances between points, we construct random nonlinear maps $\varphi_\ell$ that compress point sets in the following way. For a point set $S$, the map $\varphi_\ell:\mathbb{R}^d \to N^{-1/2}\{-1,1\}^N$ has the property that storing $\varphi_\ell(S)$ (a sketch of $S$) allows one to report squared distances between points up to a multiplicative $(1\pm \epsilon)$ error with high probability. The maps $\varphi_\ell$ are the $\ell$-fold composition of a certain type of random feature mapping.
Compared to existing techniques, our maps offer several advantages. The standard method for compressing point sets by random mappings relies on the Johnson-Lindenstrauss lemma and applies a random linear map. The main advantage of our maps $\varphi_\ell$ over random linear maps is that they map point sets directly into the discrete cube $N^{-1/2}\{-1,1\}^N$, so no additional step is needed to convert the sketch to bits. For some range of parameters, our maps $\varphi_\ell$ produce sketches using fewer bits of storage space. We validate the method with experiments, including an application to nearest neighbor search.
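The abstract does not pin down which random feature map is composed, so the following NumPy sketch is only a minimal illustration under an assumed sign-of-random-projection (SimHash-style) feature layer, which does land in $N^{-1/2}\{-1,1\}^N$; the helper names `random_feature_map`, `compose_maps`, and `sketch_sq_dists` are hypothetical, and the paper's actual map and distance estimator may differ.

```python
import numpy as np

def random_feature_map(dim_in, dim_out, rng):
    """One random feature layer mapping into N^{-1/2}{-1,1}^N.

    Illustrative sign-of-random-projection choice; the paper's
    specific feature map may differ.
    """
    W = rng.standard_normal((dim_out, dim_in))
    def phi(X):
        # X: (n_points, dim_in) -> (n_points, dim_out), entries +-N^{-1/2}
        signs = np.where(X @ W.T >= 0, 1.0, -1.0)  # avoid sign(0) = 0
        return signs / np.sqrt(dim_out)
    return phi

def compose_maps(dim_in, dim_out, ell, rng):
    """The ell-fold composition phi_ell of random feature layers."""
    layers, d = [], dim_in
    for _ in range(ell):
        layers.append(random_feature_map(d, dim_out, rng))
        d = dim_out
    def phi_ell(X):
        for phi in layers:
            X = phi(X)
        return X
    return phi_ell

def sketch_sq_dists(sketches, query_sketch):
    # Sketches lie on the unit sphere, so ||a - b||^2 = 2 - 2<a, b>.
    # Hypothetical decoder: report sketch-space squared distances directly;
    # the paper's estimator may apply a map-dependent correction.
    return 2.0 - 2.0 * sketches @ query_sketch

# Toy nearest-neighbor search from the sketch alone.
rng = np.random.default_rng(0)
phi_ell = compose_maps(dim_in=16, dim_out=512, ell=2, rng=rng)
S = rng.standard_normal((100, 16))
q = S[7] + 0.01 * rng.standard_normal(16)      # query near point 7
sketches = phi_ell(S)                          # the stored sketch of S
q_sketch = phi_ell(q[None, :])[0]
print(int(np.argmin(sketch_sq_dists(sketches, q_sketch))))  # likely 7
```

Because each layer outputs points on the unit sphere, sketch-space distances can be read off from inner products alone; in this toy setup a small input perturbation flips few signs at each layer, so the sketch-space nearest neighbor typically matches the true one.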
Supplementary Material: zip
Primary Area: Learning theory
Submission Number: 13254