Abstract: Locality-sensitive hashing (LSH) has been extensively employed to solve the c-approximate nearest neighbor search (c-ANNS) problem in high-dimensional spaces. However, the search performance of LSH degrades as the amount of data increases. To address this drawback, we propose an efficient method called Data Aware Sensitive Hashing (DASH). DASH is a data-dependent hashing algorithm that takes the residual distance prior into account. It leverages this prior knowledge and provides a theoretical guarantee for its search results. Our experimental results on various datasets show that DASH achieves better search performance, with running-time speedups of about 4–40x over other state-of-the-art methods.