Abstract: We study the problem of \emph{vector set search} with \emph{vector set queries}. This task is analogous to traditional near-neighbor search, with the exception that both the query and each element in the collection are \textit{sets} of vectors. We identify this problem as a core subroutine for many web applications and find that existing solutions are unacceptably slow. Towards this end, we present a new approximate search algorithm, DESSERT ({\bf D}ESSERT {\bf E}ffeciently {\bf S}earches {\bf S}ets of {\bf E}mbeddings via {\bf R}etrieval {\bf T}ables). DESSERT is a general tool with strong theoretical guarantees and excellent empirical performance. When we integrate DESSERT into ColBERT, a highly optimized state-of-the-art semantic search method, we find a 2-5x speedup on the MSMarco passage ranking task with minimal loss in recall, underscoring the effectiveness and practical applicability of our proposal.
0 Replies
Loading