TensorSearch: Parallel Similarity Search on Tensors

Published: 01 Jan 2024, Last Modified: 14 May 2025 | IEEE Big Data 2024 | CC BY-SA 4.0
Abstract: Existing similarity search methods, often limited to scalar or vector data, struggle to identify the complex patterns found in scientific datasets, such as 2D seismic events or 3D magnetic flux ropes. We introduce TensorSearch, a novel parallel similarity search paradigm designed to identify known patterns in high-dimensional tensors. By operating directly on tensor representations, TensorSearch captures intricate pattern structures more effectively than traditional vector-based approaches. Furthermore, its parallel architecture optimizes cache and I/O operations, enabling efficient processing of large-scale scientific data. Our performance evaluations show that TensorSearch outperforms state-of-the-art vector-based systems such as Milvus by up to 10x and achieves up to a 55x advantage over a custom MATLAB solution used by domain scientists. In these tests, TensorSearch scales linearly up to 2240 CPU cores.
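
To make the problem setting concrete, the following is a minimal, single-threaded sketch of searching for a known pattern inside a larger tensor. It is not TensorSearch's algorithm, which the abstract does not detail; it is a brute-force baseline under stated assumptions. The function name `best_match`, the use of Euclidean distance, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def best_match(data, pattern):
    """Return (index, distance) of the window of `data` closest to `pattern`.

    Hypothetical helper: brute-force Euclidean search over every window
    of `data` that has the same shape as `pattern`.
    """
    # View of all windows; shape is (positions..., *pattern.shape).
    windows = sliding_window_view(data, pattern.shape)
    # The trailing pattern.ndim axes hold each window's contents.
    axes = tuple(range(-pattern.ndim, 0))
    dist = np.sqrt(((windows - pattern) ** 2).sum(axis=axes))
    idx = np.unravel_index(np.argmin(dist), dist.shape)
    return tuple(int(i) for i in idx), float(dist[idx])


# Synthetic 2D example (assumed data): the pattern is cut from the tensor itself.
rng = np.random.default_rng(0)
data = rng.normal(size=(64, 64))
pattern = data[10:18, 20:28].copy()
print(best_match(data, pattern))
```

Because the window keeps the pattern's 2D (or 3D) shape rather than being flattened into a vector first, the same function works unchanged for a 3D pattern in a 3D tensor. A brute-force scan like this evaluates every window position, which is the kind of cost that parallelism and cache-aware data access aim to reduce.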