Fast spectral analysis for approximate nearest neighbor search

Published: 01 Jan 2022, Last Modified: 07 Oct 2024Mach. Learn. 2022EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: In large-scale machine learning, of central interest is the problem of approximate nearest neighbor (ANN) search, where the goal is to query particular points that are close to a given object under certain metric. In this paper, we develop a novel data-driven ANN search algorithm where the data structure is learned by fast spectral technique based on s landmarks selected by approximate ridge leverage scores. We show that with overwhelming probability, our algorithm returns the \((1+\epsilon /4)\)-ANN for any approximation parameter \(\epsilon \in (0,1)\). A remarkable feature of our algorithm is that it is computationally efficient. Specifically, learning k-length hash codes requires \(O((s^3+ns^2)\log n)\) running time and \(O(d^2)\) extra space, and returning the \((1+\epsilon /4)\)-ANN of the query needs \(O(k\log n)\) running time. The experimental results on computer vision and natural language understanding tasks demonstrate the significant advantage of our algorithm compared to state-of-the-art methods.
Loading