DistributedANN: Efficient Scaling of a Single DiskANN Graph Across Thousands of Computers

Published: 12 Jun 2025, Last Modified: 06 Jul 2025, Venue: VecDB 2025, Readers: Everyone, License: CC BY 4.0
Keywords: Approximate Nearest Neighbor Search, DiskANN, Information Retrieval, Distributed Systems, Search Engine Scalability
TL;DR: We put the nodes of a DiskANN graph into a distributed key-value store, and share the results of using this architecture in production at Bing.
Abstract: We present DistributedANN, a distributed vector search service that searches a single 50-billion-vector graph index spread across more than a thousand machines, with a median query latency of $26$ ms at over 100,000 queries per second. This is $6 \times$ more efficient than existing partitioning and routing strategies, which route each vector query to a subset of partitions in a scale-out vector search system. DistributedANN is built from two well-understood components: a distributed key-value store and an in-memory ANN index. DistributedANN has replaced conventional scale-out architectures for serving the Bing search engine, and we share our experience from making this transition.
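The idea in the TL;DR, graph nodes stored in a key-value store and fetched during search, can be illustrated with a minimal single-process sketch. Everything here is an assumption made for illustration and not the authors' implementation: the `ShardedKVStore` class, the modulo placement rule, the Euclidean distance, and the `search` function with its `beam_width` parameter are all hypothetical. The entry points are passed in by the caller; the abstract does not say how the in-memory ANN index is used, so that part is left out.

```python
import heapq
import math


class ShardedKVStore:
    """Toy stand-in for a distributed key-value store.

    Maps node id -> (vector, neighbor ids). Shards are plain dicts here;
    in a real deployment each shard would live on a different machine.
    """

    def __init__(self, num_shards):
        self.shards = [{} for _ in range(num_shards)]

    def _shard(self, node_id):
        # Hypothetical placement rule: node id modulo the shard count.
        return self.shards[node_id % len(self.shards)]

    def put(self, node_id, vector, neighbors):
        self._shard(node_id)[node_id] = (tuple(vector), list(neighbors))

    def get(self, node_id):
        return self._shard(node_id)[node_id]


def search(store, entry_ids, query, k, beam_width):
    """Greedy best-first graph search that reads every node via the store."""
    visited = set(entry_ids)
    candidates = []  # min-heap of (distance, node id) still to expand
    seen = []        # (distance, node id) for every node scored so far
    for node_id in entry_ids:
        vector, _ = store.get(node_id)
        d = math.dist(query, vector)
        heapq.heappush(candidates, (d, node_id))
        seen.append((d, node_id))

    while candidates:
        d, node_id = heapq.heappop(candidates)
        seen.sort()
        best = seen[:beam_width]
        # Stop once the closest unexpanded node is worse than the whole beam.
        if len(best) == beam_width and d > best[-1][0]:
            break
        _, neighbors = store.get(node_id)
        for neighbor_id in neighbors:
            if neighbor_id in visited:
                continue
            visited.add(neighbor_id)
            vector, _ = store.get(neighbor_id)
            dn = math.dist(query, vector)
            heapq.heappush(candidates, (dn, neighbor_id))
            seen.append((dn, neighbor_id))

    seen.sort()
    return [node_id for _, node_id in seen[:k]]


# Usage: ten 1-D points on a line, each linked to its immediate neighbors,
# spread over three shards.
store = ShardedKVStore(num_shards=3)
for i in range(10):
    store.put(i, (float(i),), [j for j in (i - 1, i + 1) if 0 <= j < 10])

top = search(store, entry_ids=[0], query=(7.2,), k=2, beam_width=3)
# top == [7, 8]
```

In this sketch every `store.get` call stands in for a network read, which is why the number of nodes visited per query, rather than local compute, would dominate the cost in a distributed setting.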
Submission Number: 4