Abstract: Representation learning has transformed the problem of information retrieval into one of finding the approximate set of nearest neighbors in a high-dimensional vector space. With limited hardware resources and time-critical queries, retrieval engines face an inherent tension between latency, accuracy, scalability, compactness, and the ability to load balance in distributed settings. To improve this trade-off, we propose a new algorithm, BaLanced Index for Scalable Search (BLISS), a highly tunable indexing algorithm with enviably small index sizes, making it easy to scale to billions of vectors. It iteratively refines partitions of items by learning the relevant buckets directly from query-item relevance data. To ensure that the buckets are balanced, BLISS uses the power-of-𝐾-choices strategy. We show that BLISS provides superior load balancing with high probability (and under very benign assumptions). Due to its design, BLISS can be employed for both near-neighbor retrieval (the ANN problem) and extreme classification (the XML problem). For ANN, we train and index four datasets with a billion vectors each. We compare the recall, inference time, indexing time, and index size of BLISS against the two most popular and well-optimized libraries: the Hierarchical Navigable Small World (HNSW) graph and Facebook's FAISS. BLISS requires 100× less RAM than HNSW, making it fit in memory on commodity machines, while matching HNSW's inference time at the same recall. Against FAISS-IVF, BLISS achieves similar performance with a 3-4× smaller memory footprint. BLISS is both data and model parallel, making it ideal for distributed training and inference. For XML, BLISS surpasses the best baselines' precision while being 5× faster at inference on popular multi-label datasets with half a million classes.
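The power-of-𝐾-choices strategy mentioned above can be illustrated with a minimal sketch: each item considers 𝐾 candidate buckets and is assigned to the least-loaded one, which keeps bucket sizes nearly uniform. Note this is a simplified illustration, not the paper's method: in BLISS the 𝐾 candidates come from the learned model's top-𝐾 scored buckets, whereas the sketch below draws them uniformly at random; the function name and parameters are hypothetical.

```python
import random

def assign_balanced(num_items, num_buckets, K=2, seed=0):
    """Power-of-K-choices assignment: each item draws K candidate
    buckets and joins the least-loaded one. (In BLISS, candidates
    would come from a learned model's top-K scores instead.)"""
    rng = random.Random(seed)
    loads = [0] * num_buckets
    assignment = []
    for _ in range(num_items):
        # Sample K distinct candidate buckets (stand-in for top-K scoring).
        candidates = rng.sample(range(num_buckets), K)
        # Greedily pick the candidate with the smallest current load.
        best = min(candidates, key=lambda b: loads[b])
        loads[best] += 1
        assignment.append(best)
    return assignment, loads

assignment, loads = assign_balanced(10_000, 100, K=2)
print(max(loads) - min(loads))  # gap stays small vs. single-choice hashing
```

Even with 𝐾 = 2, the classical balls-into-bins analysis gives a maximum load of only 𝑚/𝑛 + 𝑂(log log 𝑛) with high probability, which is the kind of "benign assumptions" guarantee the abstract refers to.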