Hybrid Approximate Nearest Neighbor Indexing and Search (HANNIS) for Large Descriptor Databases

M. M. Mahabubur Rahman, Jelena Tesic

2022 (modified: 19 Apr 2023) | Big Data 2022 | Readers: Everyone

Abstract: We present a hybrid method for indexing and searching approximate nearest neighbors in large, high-dimensional deep-descriptor databases. The method, hybrid approximate nearest neighbor indexing and search (HANNIS), partitions the whole data space using the k-means++ algorithm and then indexes each cluster using adapted hierarchical navigable graphs. This approach enables us to load items that are truly close to the incoming query at retrieval time, so the method retrieves truly similar items from the database even if the retrieval set is large. HANNIS outperforms all state-of-the-art methods in terms of recall at depths of up to 100 and offers consistent index loading and retrieval performance.
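
The partition-then-index idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes squared Euclidean distance, uses k-means++ seeding followed by plain Lloyd iterations, and substitutes an exhaustive scan inside each probed cluster for the adapted hierarchical navigable graphs. The names `build_index`, `search`, and the `n_probe` parameter (how many of the closest clusters are loaded per query) are hypothetical and chosen only for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_center(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))


def kmeanspp_seeds(points, k, rng):
    """k-means++ seeding: each new center is sampled with probability
    proportional to its squared distance from the nearest chosen center."""
    centers = [rng.choice(points)]
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a center; fall back to uniform.
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=d2, k=1)[0])
        d2 = [min(d, sq_dist(p, centers[-1])) for p, d in zip(points, d2)]
    return centers


def build_index(points, k, iters=20, seed=0):
    """Partition the points into k clusters (hypothetical helper).
    A graph index per cluster would be built here; this sketch only
    stores the member ids of each cluster."""
    rng = random.Random(seed)
    centers = [list(c) for c in kmeanspp_seeds(points, k, rng)]
    for _ in range(iters):
        assign = [nearest_center(p, centers) for p in points]
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    # Final assignment against the final centers.
    clusters = [[] for _ in range(k)]
    for i, p in enumerate(points):
        clusters[nearest_center(p, centers)].append(i)
    return {"points": points, "centers": centers, "clusters": clusters}


def search(index, query, top_k=5, n_probe=2):
    """Load the n_probe clusters closest to the query, then rank their
    members. The exhaustive scan stands in for a per-cluster graph search."""
    centers = index["centers"]
    order = sorted(range(len(centers)), key=lambda j: sq_dist(query, centers[j]))
    candidates = [i for j in order[:n_probe] for i in index["clusters"][j]]
    candidates.sort(key=lambda i: sq_dist(query, index["points"][i]))
    return candidates[:top_k]
```

With `n_probe` equal to the number of clusters, the sketch reduces to exact search; smaller values trade recall for fewer loaded items, which is the trade-off that recall-at-depth measurements quantify.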
