From Search to Decision: A Framework for Adversarially Robust Approximate Nearest Neighbor Search

Alexandr Andoni; Themistoklis Haris; Esty Kelman; Krzysztof Onak

Published: 29 Sept 2025, Last Modified: 12 Oct 2025. Venue: NeurIPS 2025 - Reliable ML Workshop. License: CC BY 4.0

Keywords: nearest-neighbor-search, sublinear-algorithms, adversarial-robustness, differential-privacy, fairness

TL;DR: We present a new framework for robust Approximate Nearest Neighbor search that uses differential privacy and a novel concentric LSH scheme to defend against an adversary who controls both the dataset and the queries.

Abstract: We design robust Approximate Nearest Neighbor (ANN) algorithms for a setting where an adversary controls both the dataset and $Q$ adaptive queries. Our primary contribution is a general framework that reduces search problems to a corresponding robust decision problem via a binary search tree construction. Given an oblivious decider, we robustify it by applying the Differential Privacy framework of Hassidim, Kaplan, Mansour, Matias, and Stemmer (JACM 2022), enhanced by privacy amplification via subsampling. For ANN specifically, the main challenge is designing the oblivious decider itself. To that end, we propose a sampling-based Locality-Sensitive Hashing (LSH) approach, inspired by the work of Aumüller, Har-Peled, Mahabadi, Pagh, and Silvestri (TODS 2022) on fair ANN. This method is made efficient against worst-case data distributions via a novel concentric LSH construction, which also yields an improved algorithm for the exact fair ANN problem. The result is a simple, general, and efficient algorithm for all but a narrow class of degenerate datasets. For the low-dimensional regime ($d = O(\sqrt{Q})$), we complement our general framework with specialized algorithms that provide a powerful "for-all" guarantee: correctness on every possible query with high probability. We propose novel metric covering constructions to simplify and improve prior approaches, enhancing performance for ANN in both Hamming and $\ell_p$ spaces.
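
Illustrative sketch (not taken from the paper): one plausible reading of the search-to-decision reduction is that a near neighbor can be located using only yes/no answers of the form "does this subset of the dataset contain a point within distance r of the query?". The Python below halves the candidate range with such answers, using roughly log2(n) decider calls. Every name here (`hamming`, `exact_decider`, `search_via_decisions`) is hypothetical, and the brute-force exact decider is only a stand-in for the oblivious, LSH-based decider the abstract describes. The sketch includes no differential privacy, subsampling, or adversarial robustness.

```python
from typing import Callable, Optional, Sequence, Tuple

Point = Tuple[int, ...]
Decider = Callable[[Sequence[Point], Point, int], bool]


def hamming(a: Point, b: Point) -> int:
    """Hamming distance between two equal-length tuples."""
    return sum(x != y for x, y in zip(a, b))


def exact_decider(points: Sequence[Point], q: Point, r: int) -> bool:
    """Stand-in decider (assumption): brute-force check for a point within r of q."""
    return any(hamming(p, q) <= r for p in points)


def search_via_decisions(
    points: Sequence[Point],
    q: Point,
    r: int,
    decider: Decider = exact_decider,
) -> Optional[Point]:
    """Locate a point within distance r of q using only yes/no decider answers.

    Invariant: points[lo:hi] contains a point within distance r of q.
    With an exact decider this returns the leftmost such point.
    """
    if not points or not decider(points, q, r):
        return None
    lo, hi = 0, len(points)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if decider(points[lo:mid], q, r):
            hi = mid  # a near point lies in the left half
        else:
            lo = mid  # otherwise it must lie in the right half
    return points[lo]


if __name__ == "__main__":
    data = [(0, 0, 0, 0), (1, 1, 1, 1), (1, 1, 0, 0)]
    print(search_via_decisions(data, (1, 1, 1, 0), 1))
```

Because each step consumes a single bit from the decider, a reduction of this shape only requires the decision procedure to be robust. With an approximate or randomized decider, the invariant holds only with the decider's own guarantees.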

Submission Number: 125
