Abstract: The most popular graph indices for vector search use principles from computational geometry to build the graph. Hence, their formal graph navigability guarantees are only valid in Euclidean space. In this work, we show that machine learning can be used to build graph indices for vector search in metric and non-metric vector spaces (e.g., for inner product similarity). From this novel perspective, we introduce the Support Vector Graph (SVG), a new type of graph index that leverages kernel methods to establish the graph connectivity and that comes with formal navigability guarantees valid in metric and non-metric vector spaces. In addition, we interpret the most popular graph indices, including HNSW and DiskANN, as particular specializations of SVG and show that new indices can be derived from the principles behind this specialization. Finally, we propose SVG-L0, which incorporates an $\ell_0$ sparsity constraint into the SVG kernel method to build graphs with a bounded out-degree. This yields a principled way of implementing this practical requirement, in contrast to the traditional heuristic of simply truncating the out-edges of each node. Additionally, we show that SVG-L0 has a self-tuning property that avoids the heuristic of using a set of candidates to find the out-edges of each node and that keeps its computational complexity in check.
Submission Length: Long submission (more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=OLgwQE8JSN&referrer=%5BAuthor%20Console%5D(%2Fgroup%3Fid%3DTMLR%2FAuthors%23your-submissions)
Changes Since Last Submission: We submitted this paper earlier this week (submission #5198), and it was rejected without review because it had narrower margins than the TMLR format. This was an honest mistake on our side. When converting to the TMLR format, we missed removing one LaTeX command that was decreasing the margins (the margins of the default LaTeX article template are very narrow). We hope that the Editors-in-Chief understand that this was not intentional. In fact, the length of the paper was not altered by this change: the original submission had 14 pages, as does this one. Our most sincere apologies; we were not trying to game the system in any way.
We also fixed one or two typos we found since the original submission.
Assigned Action Editor: ~Jeff_Phillips1
Submission Number: 5207