Aggregating Tree for Searching in Billion Scale High Dimensional Data

Published: 2016, Last Modified: 13 Jun 2025, Venue: ICDM Workshops 2016, License: CC BY-SA 4.0
Abstract: We present aggregating tree (A-Tree), a novel nearest neighbor search scheme for high dimensional data that uses vector quantization encodings (VQ-encodings) to build a radix tree and performs the nearest neighbor search over it by beam search. To search accurately and efficiently, we suggest that VQ-encodings satisfy a locally aggregating encoding criterion: for any node of the corresponding A-Tree, neighboring vectors should aggregate in few subtrees, which makes beam search efficient. We suggest two further criteria for effective VQ-encodings, which resemble the balanced and uncorrelated bit criteria for hashing codes. We use generalized residual vector quantization (GRVQ) encodings to build the A-Tree so that it meets the suggested criteria, and this combination shows significantly better performance. Our methods are validated on several standard benchmark datasets, including one containing a billion vectors. Experimental results show the superior efficiency and effectiveness of our proposed methods compared to the state of the art.
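The idea of a tree built from VQ-encodings and searched by beam search can be illustrated with a minimal sketch. It is not the paper's implementation: plain greedy residual quantization with hand-picked codebooks stands in for GRVQ, an uncompressed prefix tree stands in for the radix tree, and all names (`rvq_encode`, `build_tree`, `beam_search`, `beam_width`) are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def rvq_encode(vec, codebooks):
    """Greedy residual encoding: pick one codeword index per codebook."""
    residual, code = list(vec), []
    for book in codebooks:
        idx = min(range(len(book)), key=lambda i: sq_dist(residual, book[i]))
        code.append(idx)
        residual = sub(residual, book[idx])
    return tuple(code)

def build_tree(codes):
    """Prefix tree: level m branches on the m-th code index; leaves hold vector ids."""
    root = {}
    for vec_id, code in enumerate(codes):
        node = root
        for idx in code[:-1]:
            node = node.setdefault(idx, {})
        node.setdefault(code[-1], []).append(vec_id)
    return root

def beam_search(query, root, codebooks, beam_width):
    """Descend level by level, keeping the beam_width closest partial reconstructions."""
    beam = [(0.0, [0.0] * len(query), root)]
    for book in codebooks:
        candidates = []
        for _, partial, node in beam:
            for idx, child in node.items():
                recon = add(partial, book[idx])
                candidates.append((sq_dist(query, recon), recon, child))
        candidates.sort(key=lambda c: c[0])  # sort by distance only
        beam = candidates[:beam_width]
    # After the last level each beam entry holds a list of vector ids.
    return [(dist, vec_id) for dist, _, ids in beam for vec_id in ids]

# Toy data and codebooks (assumed values, chosen only for illustration).
codebooks = [
    [[0, 0], [10, 0], [0, 10], [10, 10]],   # coarse level
    [[-1, -1], [1, -1], [-1, 1], [1, 1]],   # residual level
]
data = [[1, 1], [9, 1], [11, 9], [-1, 11], [1, -1]]
codes = [rvq_encode(v, codebooks) for v in data]
tree = build_tree(codes)
hits = beam_search([8.5, 1.5], tree, codebooks, beam_width=2)
print(hits)
```

In this toy setup, vectors that are close to each other share a first-level code and therefore sit in the same subtree, so a narrow beam is enough to reach them. This is the kind of behavior the locally aggregating criterion asks for.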