Abstract: Similarity search in very high dimensions is vital for many scientific research activities as well as real applications. Achieving a high-performance, scalable solution with optimal result quality remains challenging. We propose a vote-count-based algorithm using p-stable distributions for approximate similarity search, which is sufficient for many real applications. Our algorithm is efficient and scales with both dimension and database size. We also propose a novel hardware implementation of the algorithm using a simple modification to Random Access Memory (RAM). The hardware design provides real-time search over millions of points at practical cost. We empirically achieve high accuracy for query results using our algorithm on 128-dimensional synthetic and real datasets.
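The abstract does not spell out the algorithm, so the following is only a minimal sketch of how vote counting over p-stable hashes could work. It is an illustration under stated assumptions, not the authors' method. It assumes the 2-stable (Gaussian) case with the standard hash form h(v) = floor((a . v + b) / w), one bucket table per hash function, and one vote per bucket collision with the query. All names (`make_hashes`, `build_index`, `query`) and parameter values (`width`, `num_hashes`) are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def make_hashes(dim, num_hashes, width, rng):
    """Draw 2-stable (Gaussian) projection vectors with uniform offsets in [0, width)."""
    return [
        ([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, width))
        for _ in range(num_hashes)
    ]


def bucket(point, a, b, width):
    """Assumed hash form: h(v) = floor((a . v + b) / width)."""
    return math.floor((sum(x * y for x, y in zip(a, point)) + b) / width)


def build_index(points, hashes, width):
    """One table per hash function, mapping bucket id -> list of point indices."""
    tables = [defaultdict(list) for _ in hashes]
    for idx, p in enumerate(points):
        for table, (a, b) in zip(tables, hashes):
            table[bucket(p, a, b, width)].append(idx)
    return tables


def query(q, tables, hashes, width, k=1):
    """Return up to k point indices that collide with q in the most tables (votes)."""
    votes = Counter()
    for table, (a, b) in zip(tables, hashes):
        # Every point sharing the query's bucket in this table receives one vote.
        votes.update(table.get(bucket(q, a, b, width), ()))
    return [idx for idx, _ in votes.most_common(k)]


if __name__ == "__main__":
    rng = random.Random(0)
    dim, width = 128, 4.0
    points = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(1000)]
    hashes = make_hashes(dim, num_hashes=32, width=width, rng=rng)
    tables = build_index(points, hashes, width)
    # Query with a slightly perturbed copy of a stored point.
    noisy = [x + rng.gauss(0.0, 0.01) for x in points[42]]
    print(query(noisy, tables, hashes, width, k=3))
```

In this sketch, nearby points tend to fall into the same bucket under many of the random projections, so they accumulate more votes than distant points. The per-table lookup followed by a counter increment is the kind of operation a memory-side implementation might parallelize, though the hardware design itself is not described here.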