Learning Query-Space Document Representations for High-Recall Retrieval

Published: 01 Jan 2023, Last Modified: 26 Apr 2025, Venue: ECIR (2) 2023, License: CC BY-SA 4.0
Abstract: Recent studies have shown that the significant performance improvements reported for neural rankers do not necessarily extend to a diverse range of queries. A large set of queries cannot be addressed effectively by neural rankers, primarily because first-stage retrievers fail to identify the documents relevant to these queries. In this paper, we propose a novel approach that represents documents within the query space and hence increases the likelihood of recalling a larger number of relevant documents. Based on experiments on the MS MARCO dataset and on the hardest subset of its queries, we find that the proposed approach behaves synergistically with existing neural rankers and increases recall both on the MS MARCO dev set queries and on the hardest MS MARCO queries.