Keywords: nearest neighbor search, adversarial robustness, differential privacy, locality-sensitive hashing, randomized algorithms
TL;DR: We develop time- and space-efficient algorithms for the approximate nearest neighbor problem in the presence of adversarially generated queries.
Abstract: We study the Approximate Nearest Neighbor (ANN) problem under a powerful adaptive adversary that controls both the dataset and a sequence of $Q$ queries.
For the high-dimensional regime $d = \omega(\sqrt{Q})$, we develop a sequence of algorithms with progressively stronger guarantees. We first establish a novel connection between adaptive security and *fairness*, leveraging fair ANN search [Aumüller et al., 2022] to hide internal randomness from the adversary with information-theoretic guarantees. To achieve data-independent performance, we then reduce the search problem to a robust decision primitive, solved using a differentially private mechanism [Hassidim et al., 2022] on a Locality-Sensitive Hashing (LSH) data structure. This approach, however, faces an inherent $\sqrt{n}$ query-time barrier. To break this barrier, we propose a novel concentric-annuli LSH construction that combines these fairness and differential privacy techniques. The analysis introduces a new method for robustly releasing timing information from the underlying algorithm instances and, as a corollary, also improves existing results for fair ANN.
In addition, for the low-dimensional regime $d = O(\sqrt{Q})$, we propose specialized algorithms that provide a strong *for-all* guarantee: correctness on *every* possible query with high probability. We introduce novel metric covering constructions that simplify and improve prior approaches for ANN in Hamming and $\ell_p$ spaces.
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 9172