Abstract: Vector quantization is one of the critical techniques which enables dense retrieval for realtime applications. The recent study shows that vanilla vector quantization methods, like those implemented by FAISS [8], are lossy and prone to limited retrieval performances when large acceleration ratios are needed [14, 16, 18]. Besides, there have also been multiple algorithms which make the retriever and VQ better collaborated to alleviate such a loss. On top of these progresses, we develop LibVQ, which optimizes vector quantization for efficient dense retrieval. Our toolkit is highlighted for three advantages. 1. Effectiveness. The retrieval quality can be substantially improved over the vanilla implementations of VQ. 2. Simplicity. The optimization can be conducted in a lowcode fashion, and the optimization results can be easily loaded to ANN indexes to support downstream applications. 3. Universality. The optimization is agnostic to the embedding's learning process, and may accommodate different input conditions and ANN back-ends with little modification of the workflow. LibVQ may also support rich applications beyond dense retrieval, e.g., embedding compression, topic modeling, and de-duplication. In this demo, we provide comprehensive hand-on examples and evaluations for LibVQ. The toolkit is publicly released at: https://github.com/staoxiao/LibVQ/tree/demo.
0 Replies
Loading