Optimistic Query Routing in Clustering-based Approximate Maximum Inner Product Search

Published: 18 Sept 2025, Last Modified: 29 Oct 2025NeurIPS 2025 posterEveryoneRevisionsBibTeXCC BY-NC-ND 4.0
Keywords: maximum inner product search, clustering-based nearest neighbor search, query routing, sketching, optimism principle
TL;DR: We present an optimistic query routing protocol in clustering-based approximate maximum inner product search that reaches the same accuracy as SoTA routers by probing 50% fewer data points.
Abstract: Clustering-based nearest neighbor search algorithms partition points into shards to form an index, and search only a subset of shards to process a query. Even though search efficacy is heavily influenced by the algorithm that identifies the shards to probe, it has received little attention in the literature. We study routing in clustering-based maximum inner product search, which includes cosine similarity search. We unpack existing routers and notice the surprising role of optimism. We then take a page from the sequential decision making literature and formalize that insight following the principle of ``optimism in the face of uncertainty.'' In particular, we present a framework that incorporates the moments of the distribution of inner products within each shard to estimate the maximum inner product. We then develop a practical instance of our algorithm that uses only the first two moments to reach the same accuracy as state-of-the-art routers by probing up to $50\%$ fewer points on benchmark datasets without compromising efficiency. Our algorithm is also space-efficient: we design a sketch of the second moment whose size is independent of the number of points and requires $\mathcal{O}(1)$ vectors per shard.
Primary Area: Other (please use sparingly, only use the keyword field for more details)
Submission Number: 9382
Loading