Timestamp Approximate Nearest Neighbor Search Over High-Dimensional Vector Data

Published: 2025, Last Modified: 16 Sept 2025. Venue: ICDE 2025. License: CC BY-SA 4.0
Abstract: Unstructured data, such as images and text, are increasingly represented as high-dimensional vectors for emerging AI applications like retrieval-augmented generation. A key operation in these applications is querying for vectors that are both semantically similar and temporally relevant. This operation can be formulated as Timestamp Approximate Nearest Neighbor Search (TANNS), where both the vectors and the query carry temporal attributes, and the goal is to retrieve the approximate nearest neighbors that are valid at the given timestamp. A naive solution is to create a separate index for each timestamp, which enables accurate and fast searches but incurs high update latency and excessive storage demands. In this paper, we introduce the timestamp graph, a novel structure that supports rapid index updates while minimizing storage costs. Exploiting the temporal locality of changes in valid vectors, our timestamp graph manages a unified index across all historical timestamps, thereby substantially reducing storage overhead. Moreover, we design the historic neighbor tree, which further reduces the space complexity to that of a single-timestamp index. Extensive evaluations on four standard datasets show that our method achieves over 99% accuracy while improving query efficiency by 4.4× to 138.1× over existing solutions.
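To make the TANNS query semantics concrete, the following is a minimal brute-force sketch, not the timestamp graph or the historic neighbor tree from the paper. It assumes each vector is valid over a half-open interval [start, end) and uses Euclidean distance; the names `TimedVector` and `tanns_bruteforce` are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class TimedVector:
    vec: tuple[float, ...]
    start: int               # first timestamp at which the vector is valid (assumed inclusive)
    end: float = math.inf    # timestamp at which it expires (assumed exclusive)


def tanns_bruteforce(data, query, t, k):
    """Exact timestamp-filtered k-NN by linear scan (illustrative baseline only)."""
    # Keep only vectors valid at timestamp t, paired with their distance to the query.
    valid = [
        (math.dist(v.vec, query), i)
        for i, v in enumerate(data)
        if v.start <= t < v.end
    ]
    valid.sort()  # ascending by distance, ties broken by index
    return [i for _, i in valid[:k]]


# Toy data: three 2-D vectors with different validity intervals.
data = [
    TimedVector((0.0, 0.0), start=0, end=5),
    TimedVector((1.0, 1.0), start=2),
    TimedVector((0.1, 0.0), start=6),
]

print(tanns_bruteforce(data, (0.0, 0.0), t=3, k=2))
print(tanns_bruteforce(data, (0.0, 0.0), t=7, k=2))
```

The same query returns different neighbors at different timestamps because the set of valid vectors changes. A linear scan like this is exact but costs time proportional to the dataset size on every query, which is the kind of cost an index is meant to avoid.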