DiskHIVF: Disk-Resident Hierarchical Inverted File Index For Billion-scale Approximate Nearest Neighbor Search

15 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: billion-scale; disk-resident; approximate nearest neighbor search
TL;DR: This paper introduces DiskHIVF, a novel hybrid disk-memory indexing algorithm for approximate nearest neighbor search that offers superior memory space complexity and search performance and can reduce memory overhead by up to a hundred times.
Abstract: In-memory algorithms for approximate nearest neighbor search (ANNS) have demonstrated remarkable success. However, as the scale of vector data grows, the memory demands of in-memory indexing become increasingly prohibitive. A promising solution lies in hybrid memory-disk implementations, which offload the bulk of data storage to cost-efficient devices such as solid-state drives (SSDs) while retaining only frequently accessed data in memory. Nevertheless, existing hybrid memory-disk indexing methods suffer from memory overheads that scale proportionally with the number and dimensionality of the vectors, limiting their memory savings to a modest 5–20$\times$. In this paper, we introduce the Disk-Resident Hierarchical Inverted File Index (DiskHIVF), a novel hybrid memory-disk indexing algorithm with a memory space complexity of $O(\sqrt{N} \cdot d + N)$, where $N$ is the number of vectors and $d$ is their dimensionality. Owing to this lower space complexity, DiskHIVF uses several hundred times less memory than storing the original vectors, and 10–30$\times$ less than state-of-the-art methods. Experimental results on four datasets show that DiskHIVF is 1.2–2.3$\times$ faster than state-of-the-art hybrid indexing solutions at the same recall of 90\%. These results indicate that our approach can significantly reduce machine resource overhead while maintaining high search performance.
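To give a feel for what a memory bound of $O(\sqrt{N} \cdot d + N)$ implies, the following back-of-the-envelope sketch compares it with the size of the raw vectors. The constants are assumptions for illustration only and are not taken from the paper: 4-byte (float32) components, roughly $\sqrt{N}$ in-memory centroids of dimension $d$, and 1 byte of in-memory state per vector for the $N$ term. The function names are hypothetical.

```python
import math


def raw_vector_bytes(n: int, d: int, bytes_per_value: int = 4) -> int:
    """Size of the original vectors: n vectors with d components each."""
    return n * d * bytes_per_value


def hybrid_index_bytes(
    n: int,
    d: int,
    bytes_per_value: int = 4,   # assumption: float32 centroid components
    bytes_per_vector: int = 1,  # assumption: constant hidden in the O(N) term
) -> int:
    """Illustrative footprint for an O(sqrt(N) * d + N) memory bound."""
    num_centroids = math.isqrt(n)  # assumption: about sqrt(N) centroids
    centroid_bytes = num_centroids * d * bytes_per_value
    per_vector_bytes = n * bytes_per_vector
    return centroid_bytes + per_vector_bytes


if __name__ == "__main__":
    n, d = 10**9, 128  # one billion 128-dimensional vectors
    raw = raw_vector_bytes(n, d)
    idx = hybrid_index_bytes(n, d)
    print(f"raw vectors:  {raw / 1e9:.1f} GB")
    print(f"index memory: {idx / 1e9:.2f} GB")
    print(f"ratio:        {raw / idx:.0f}x")
```

Under these assumed constants the $N$ term dominates the in-memory footprint, and the ratio to the raw vectors is on the order of several hundred. The actual constants depend on the DiskHIVF implementation described in the paper.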
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 5411