Succeeding at Scale: Enterprise Retrieval Benchmark Construction and Index-Preserving Query Adaptation for Multi-Tenant Search
Keywords: Information Retrieval, Dataset Construction, Query-Side Fine-tuning, Parameter-Efficient Fine-Tuning
TL;DR: We introduce Enterprise-Search, an automatically constructed benchmark, and show that parameter-efficient query-only tuning adapts dense retrieval models without re-indexing document corpora
Abstract: Large-scale multi-tenant retrieval systems generate extensive query logs but lack curated relevance labels for effective domain adaptation, leaving substantial "dark data" unexploited. This challenge is compounded by the high cost of model updates: jointly fine-tuning the query and document encoders requires re-indexing the full corpus, which is impractical in multi-tenant settings with thousands of isolated indices. We introduce Enterprise-Search, a passage retrieval benchmark for technical customer support built via a fully automated pipeline. Candidate generation fuses rankings from diverse sparse and dense retrievers, followed by an LLM-as-a-Judge stage for consistency filtering and relevance labeling. We then systematically evaluate index-preserving adaptation strategies that fine-tune only the query encoder while keeping the document indices fixed. Experiments on Enterprise-Search, SciFact, and FiQA-2018 show that parameter-efficient fine-tuning of the query encoder delivers a favorable quality–efficiency trade-off, enabling scalable and practical enterprise multi-tenant retrieval.
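The core idea of index-preserving query-only adaptation can be illustrated with a minimal numerical sketch. All names here are hypothetical and not taken from the paper: document embeddings are precomputed once and treated as a frozen index, while a small low-rank (LoRA-style) adapter on the query side is trained with a softmax cross-entropy loss over the corpus. No document embedding is ever recomputed, so no re-indexing is needed.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, rank, n_docs = 64, 4, 100

# Frozen document index: embedded once, never touched during adaptation.
doc_index = rng.normal(size=(n_docs, dim))

# Low-rank query-side adapter (LoRA-style init: B = 0, so the
# adapted query initially equals the base query embedding).
A = rng.normal(scale=0.01, size=(dim, rank))
B = np.zeros((rank, dim))

def adapt_query(q):
    # Base query embedding plus low-rank update: q + q @ A @ B
    return q + q @ A @ B

def scores(q):
    # Dot-product relevance scores against the frozen index.
    return doc_index @ adapt_query(q)

def train_step(q, pos_idx, lr=0.1):
    """One gradient step on softmax cross-entropy for a single
    (query, positive document) pair; only A and B are updated."""
    global A, B
    s = scores(q)
    p = np.exp(s - s.max())
    p /= p.sum()                         # softmax over all documents
    grad_s = p.copy()
    grad_s[pos_idx] -= 1.0               # dL/ds for cross-entropy
    grad_q = doc_index.T @ grad_s        # dL/d(adapted query)
    # Chain rule through q + (q @ A) @ B:
    A -= lr * np.outer(q, grad_q @ B.T)
    B -= lr * np.outer(A.T @ q, grad_q)
```

After a few steps on a labeled pair, the positive document's softmax probability rises while `doc_index` is byte-for-byte unchanged, which is exactly the property that makes this strategy deployable across many isolated tenant indices.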
Submission Number: 51