Exploring the Practicality of Generative Retrieval on Dynamic Corpora

Anonymous

16 Dec 2023 | ACL ARR 2023 December Blind Submission | Readers: Everyone
TL;DR: We conduct a comprehensive comparison of the practicality of Dual Encoders and Generative Retrievals on dynamic corpora to better align with real-world scenarios where knowledge is ever-evolving.
Abstract: Information retrieval (IR) performance is mostly benchmarked on a fixed set of documents (static corpora). In realistic scenarios, however, this is rarely the case: the documents to be retrieved are constantly added and updated. In this paper, we conduct a comprehensive comparison between two categories of contemporary retrieval systems, Dual Encoders (DE) and Generative Retrievals (GR), in a dynamic scenario where the corpora are updated. We also extensively evaluate computational and memory efficiency, which are crucial factors for real-world deployment of IR systems handling vast and ever-changing document collections. Our results on the StreamingQA benchmark demonstrate that GR is more adaptable to evolving knowledge (+4 to 11%), robust in handling data with temporal information, and efficient in terms of memory (×4), indexing time (×6), and inference FLOPs (×2). Our paper highlights the potential of GR for future use in practical IR systems.
Paper Type: long
Research Area: Information Retrieval and Text Mining
Contribution Types: Model analysis & interpretability, Position papers
Languages Studied: English