Track: Search and retrieval-augmented AI
Keywords: Approximate nearest neighbor search, Learning-to-index
Abstract: Approximate nearest neighbor (ANN) search is fundamental to applications such as information retrieval.
To improve efficiency, partition-based methods narrow the search space by probing only a subset of the partitions, yet they face two common issues.
First, in the query phase, a strategy widely adopted by existing methods such as IVF is to probe partitions in order of the query's distance to the partition centroids.
This inevitably leads to probing irrelevant partitions, since the data distribution is not considered.
Second, in the partition construction phase, all partition-based methods suffer from a boundary problem: a query's $k$NN are split across multiple partitions, producing a long-tailed $k$NN distribution that degrades the optimal $nprobe$ (i.e., the number of partitions probed) and the search efficiency.
To address these problems, we propose LIRA, a LearnIng-based queRy-aware pArtition framework. Specifically, we propose a probing model to learn and directly probe the partitions containing the $k$NN of a query. Probing partitions with the model can reduce probing waste and allow for query-aware probing with query-specific $nprobe$. Moreover, we incorporate the probing model into a learning-based redundancy strategy to mitigate the adverse impact of the long-tailed $k$NN distribution on partition probing.
Extensive experiments on real-world vector datasets demonstrate the superiority of LIRA in the trade-off among accuracy, latency, and query fan-out.
The results show that LIRA consistently reduces latency and query fan-out by up to 30\%.
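
The contrast between centroid-rank probing and query-aware probing can be illustrated with a toy example. The sketch below is not LIRA's implementation: the data, the function names, and the hard-coded `hypothetical_scores` (standing in for the output of some learned probing model) are all assumptions made for illustration. It only shows how a query near a partition boundary can have its $k$NN in a partition whose centroid is not the closest one, so that a fixed $nprobe$ of 1 misses them while a score threshold yields a query-specific set of partitions.

```python
import math

def centroid_rank_probe(query, centroids, nprobe):
    """IVF-style probing: return the nprobe partitions with the closest centroids."""
    order = sorted(range(len(centroids)), key=lambda i: math.dist(query, centroids[i]))
    return order[:nprobe]

def score_threshold_probe(scores, threshold=0.5):
    """Query-aware probing: return every partition whose score exceeds the threshold.

    `scores` are per-partition values in [0, 1] from a hypothetical learned model;
    the number of probed partitions therefore varies per query.
    """
    picked = [i for i, s in enumerate(scores) if s > threshold]
    if not picked:
        # Always probe at least the highest-scoring partition.
        picked = [max(range(len(scores)), key=lambda i: scores[i])]
    return picked

def knn_partitions(query, points, assignments, k):
    """Ground truth: the set of partitions that hold the query's exact kNN."""
    order = sorted(range(len(points)), key=lambda i: math.dist(query, points[i]))
    return {assignments[i] for i in order[:k]}

# Toy data (assumed): three partitions, each point assigned to its nearest centroid.
centroids = [(0.0, 0.0), (4.0, 0.0), (2.5, 3.0)]
points = [(-2.0, 0.0), (1.8, 0.0), (1.9, 0.0),   # partition 0
          (3.5, 0.0), (4.0, 0.0), (5.0, 0.0),    # partition 1
          (2.5, 3.0), (2.5, 3.5)]                # partition 2
assignments = [0, 0, 0, 1, 1, 1, 2, 2]

# The query falls on partition 1's side of the boundary, but its 2 nearest
# neighbors sit just across the boundary in partition 0.
query = (2.4, 0.0)
truth = knn_partitions(query, points, assignments, k=2)

# Hard-coded stand-in for a learned model's per-partition scores (assumption).
hypothetical_scores = [0.9, 0.2, 0.05]

fixed_probe = centroid_rank_probe(query, centroids, nprobe=1)
aware_probe = score_threshold_probe(hypothetical_scores)

print("partitions holding the kNN:", truth)
print("centroid-rank probe (nprobe=1):", fixed_probe)
print("score-threshold probe:", aware_probe)
```

In this toy setting, centroid-rank probing needs $nprobe = 2$ to reach the partition that holds the neighbors, whereas the thresholded scores select that partition alone. How well a real probing model achieves this depends on its training and is not captured by the sketch.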
Submission Number: 1603