Improving Instruction-Aware Retrieval with Query-Preserving Regularization

Hyewon Kim, Hyun-Je Song

Published: 2026, Last Modified: 30 Apr 2026. Venue: ECIR (2) 2026. License: CC BY-SA 4.0
Abstract: Instruction-aware retrievers incorporate natural language instructions to express fine-grained retrieval constraints beyond the original query. These retrievers are typically trained with contrastive learning that considers relevance signals from both standard queries and instruction-augmented queries. However, prior instruction-aware retrievers learn instruction-augmented queries solely from document relevance signals, without explicitly preserving the semantics of the original query. As a result, instruction signals can dominate query semantics during training, leading to retrieved results that either fail to follow the instruction or are irrelevant to the original query. To address this issue, we propose a query-preserving regularization that enforces consistency between the relevance distributions induced by the original query and by the query component within the instruction-augmented query. This regularization prevents instruction signals from dominating query semantics while still allowing instructions to refine relevance estimation. Experiments on two instruction-following retrieval benchmarks demonstrate that our method improves the existing state-of-the-art instruction-aware retriever. Furthermore, our model achieves strong performance on standard retrieval tasks without instructions, in both in-domain and out-of-domain scenarios.
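
The abstract does not give the loss in closed form, so the following is only a minimal sketch of how a contrastive objective might be combined with a consistency term between two relevance distributions. Everything specific here is an assumption for illustration: the use of KL divergence as the consistency measure, dot-product scoring, the weight `reg_weight`, the temperature, and the idea that `query_part_vec` is some pooled embedding of the query tokens inside the instruction-augmented input. None of these names or choices come from the paper.

```python
import math


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def relevance_distribution(query_vec, doc_vecs, temperature=1.0):
    """Distribution over candidate documents induced by one query embedding."""
    return softmax([dot(query_vec, d) for d in doc_vecs], temperature)


def contrastive_loss(query_vec, doc_vecs, positive_index, temperature=1.0):
    """Negative log-probability of the positive document (InfoNCE-style)."""
    probs = relevance_distribution(query_vec, doc_vecs, temperature)
    return -math.log(probs[positive_index])


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q); an assumed choice of consistency measure."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def regularized_loss(orig_query_vec, query_part_vec, instr_query_vec,
                     doc_vecs, positive_index, reg_weight=0.1, temperature=1.0):
    """Contrastive loss on the instruction-augmented query plus a
    consistency penalty between the original-query distribution and the
    distribution from the query component (hypothetical formulation)."""
    task = contrastive_loss(instr_query_vec, doc_vecs, positive_index, temperature)
    p_orig = relevance_distribution(orig_query_vec, doc_vecs, temperature)
    p_part = relevance_distribution(query_part_vec, doc_vecs, temperature)
    return task + reg_weight * kl_divergence(p_orig, p_part)


# Toy embeddings, chosen arbitrarily for illustration.
docs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
orig_q = [1.0, 0.2]
drifted_q = [0.1, 1.0]   # query component pulled toward the instruction
instr_q = [0.9, 0.3]

loss_consistent = regularized_loss(orig_q, orig_q, instr_q, docs, positive_index=0)
loss_drifted = regularized_loss(orig_q, drifted_q, instr_q, docs, positive_index=0)
```

In this sketch the penalty is zero when the query component induces the same document ranking distribution as the original query, and grows as the two distributions diverge, which is one way to read "prevents instruction signals from dominating query semantics". The actual formulation in the paper may differ.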