Dual-Branch HNSW Approach with Skip Bridges and LID-Driven Optimization

Hy Nguyen; Nguyen Hung Nguyen; Nguyen Linh Bao Nguyen; Srikanth Thudumu; Rajesh Vasa; Kon Mouzakis

27 Sept 2024 (modified: 05 Feb 2025), Submitted to ICLR 2025, CC BY 4.0

Keywords: Nearest Neighbor Search, Optimization

TL;DR: We propose an improved HNSW algorithm that tackles local optima and slow inference issues by using a dual-branch structure, LID-based insertion and bridge-building shortcuts, resulting in faster and more accurate performance across various datasets

Abstract: The Hierarchical Navigable Small World (HNSW) algorithm is widely used for approximate nearest neighbor (ANN) search, leveraging the principles of navigable small-world graphs. However, it has two main limitations. The first is the local optima problem, which arises from the algorithm's greedy search strategy: at each step it selects neighbors based solely on proximity. This often leads to cluster disconnections. The second is that HNSW frequently fails to achieve logarithmic complexity, particularly on high-dimensional datasets, because it traverses each layer exhaustively. To address these limitations, we propose a novel algorithm that mitigates local optima and cluster disconnections while improving inference speed. The first component is a dual-branch HNSW structure with insertion mechanisms based on Local Intrinsic Dimensionality (LID), enabling traversal from multiple directions. This improves the capture of outlier nodes, enhances cluster connectivity, and reduces the risk of local minima. The second component is a bridge-building technique that adds shortcuts between layers, enabling direct jumps and speeding up inference. Experiments on various benchmarks and datasets show that our algorithm outperforms the original HNSW in both accuracy and speed. We evaluate on six datasets spanning Computer Vision (CV), Deep Learning (DL), and Natural Language Processing (NLP), showing improvements of 2.5% in NLP, 15% in DL, and up to 35% in CV tasks. Inference speed also improves by 12% across all datasets. Ablation studies show that LID-based insertion has the greatest impact on performance, followed by the dual-branch structure and the bridge-building component.
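As a rough illustration of the LID quantity mentioned in the abstract, the sketch below computes a maximum-likelihood LID estimate for one point from the distances to its k nearest neighbors. This is an assumption for illustration only: the abstract does not say which LID estimator the authors use or how the value enters the insertion step, and the function name `lid_mle` is hypothetical.

```python
import math


def lid_mle(distances):
    """Maximum-likelihood LID estimate from k-nearest-neighbor distances.

    Assumed estimator (not confirmed by the abstract):
        LID = -1 / mean(ln(r_i / r_k)),
    where r_k is the largest of the k neighbor distances.
    """
    # Zero distances (exact duplicates) are dropped because ln(0) is undefined.
    r = sorted(d for d in distances if d > 0)
    if len(r) < 2:
        raise ValueError("need at least two positive distances")
    r_max = r[-1]
    mean_log = sum(math.log(d / r_max) for d in r) / len(r)
    if mean_log == 0:
        # All neighbors lie at the same distance; the estimate diverges.
        return math.inf
    return -1.0 / mean_log


# Distances from a query point to its three nearest neighbors.
estimate = lid_mle([1.0, 2.0, 4.0])
```

Under this estimator, a higher value means the neighbor distances are bunched close to the largest one, which is typical of locally high-dimensional regions; a lower value means the distances are spread out.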

Supplementary Material: zip

Primary Area: optimization

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 9532
