Keywords: Approximate Nearest Neighbor Search, DiskANN, Information Retrieval, Distributed Systems, Search Engine Scalability
TL;DR: We put the nodes of a DiskANN graph into a distributed key-value store, and share the results of using this architecture in production at Bing.
Abstract: We present DistributedANN, a distributed vector search service that makes
it possible to search over a single 50 billion vector graph index
spread across over a thousand machines that offers $26$ms median query latency
and processes over 100,000 queries per second.
This is $6 \times$ more efficient than existing partitioning and
routing strategies that route the vector query to
a subset of partitions in a scale out vector search system.
DistributedANN is built using two well-understood components:
a distributed key-value store and an in-memory ANN index.
DistributedANN has replaced conventional scale-out architectures
for serving the Bing search engine, and we share our
experience from making this transition.
Submission Number: 4
Loading