Balancing the Blend: An Experimental Analysis of Trade-offs in Hybrid Search

Published: 2025, Last Modified: 15 Jan 2026CoRR 2025EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: Hybrid search, the integration of lexical and semantic retrieval, has become a cornerstone of modern information retrieval systems, driven by demanding applications like Retrieval-Augmented Generation (RAG). The architectural design space for these systems is vast and complex, yet a systematic understanding of the trade-offs among their core components -- retrieval paradigms, combination schemes, and re-ranking methods -- is lacking. To address this, and informed by our experience building the Infinity open-source database, we present the first experimental analysis of advanced hybrid search architectures. Our framework integrates four retrieval paradigms -- Full-Text Search (FTS), Sparse Vector Search (SVS), Dense Vector Search (DVS), and Tensor Search (TenS) -- and evaluates their combinations and re-ranking strategies across 11 real-world datasets. Our results reveal three key findings: (1) A "weakest link" phenomenon, where a weak path can substantially degrade overall accuracy, highlighting the need for path-wise quality assessment before fusion. (2) A data-driven map of performance trade-offs, demonstrating that optimal configurations depend heavily on resource constraints and data characteristics, precluding a one-size-fits-all solution. (3) The identification of Tensor-based Re-ranking Fusion (TRF) as a high-efficacy alternative to mainstream fusion methods, offering the semantic power of tensor search at a fraction of the computational and memory cost. Our findings offer concrete guidelines for designing adaptive, scalable hybrid search systems and identify key directions for future research.
Loading