Document Aware Contrastive Learning Approach for Generative Retrieval

Published: 31 May 2024 · Last Modified: 18 Jun 2024 · Venue: Gen-IR @ SIGIR 2024 · License: CC BY 4.0
Keywords: Generative Retrieval, Natural Language Processing, Information Retrieval.
TL;DR: Improving generative retrieval models through direct query-document interaction via document-aware training.
Abstract: The field of neural retrieval has seen significant advances with the rise of dense retrieval techniques, in which documents are indexed by embeddings produced by neural networks, particularly transformer-based models. Generative retrieval, a newer paradigm, employs sequence-to-sequence models to generate unique identifier strings for passages, offering an alternative approach to indexing and retrieval. However, these generative models primarily focus on learning mappings between queries and identifiers, which may limit their ability to fully capture the relationship between queries and passages. This work introduces a method, inspired by dense retrieval, that enhances the learning of context-aware representations and thereby fosters a deeper understanding of the relationship between queries and documents. Additionally, a curriculum-based learning strategy is proposed to optimize the contrastive losses effectively. Extensive experiments were conducted on the publicly available Natural Questions dataset to evaluate the proposed approach. The results show modest improvements across all metrics, highlighting the robustness of the method.
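To make the idea of pairing an identifier-generation objective with a query-document contrastive term more concrete, the sketch below shows one plausible formulation. It is a minimal illustration assuming pooled query/document embeddings, an in-batch InfoNCE-style loss, and a fixed weighting; the function name, temperature, and weighting scheme are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch: identifier-generation loss plus a document-aware
# contrastive term. Names and hyperparameters are illustrative assumptions.
import torch
import torch.nn.functional as F


def document_aware_loss(query_emb, doc_emb, gen_logits, id_targets,
                        temperature=0.05, contrastive_weight=0.5):
    """Combine identifier-generation cross-entropy with an in-batch
    contrastive loss between query and document representations.

    query_emb:  (B, H) pooled encoder representations of the queries
    doc_emb:    (B, H) pooled representations of the paired documents
    gen_logits: (B, T, V) decoder logits over identifier tokens
    id_targets: (B, T) target identifier token ids (padding = -100)
    """
    # Standard generative-retrieval objective: predict the document identifier.
    gen_loss = F.cross_entropy(
        gen_logits.reshape(-1, gen_logits.size(-1)),
        id_targets.reshape(-1),
        ignore_index=-100,
    )

    # In-batch contrastive objective: each query should score its own
    # document higher than the other documents in the batch.
    q = F.normalize(query_emb, dim=-1)
    d = F.normalize(doc_emb, dim=-1)
    sim = q @ d.t() / temperature                 # (B, B) similarity matrix
    labels = torch.arange(sim.size(0), device=sim.device)
    contrastive_loss = F.cross_entropy(sim, labels)

    return gen_loss + contrastive_weight * contrastive_loss
```

A curriculum-based strategy like the one the abstract mentions could, for instance, schedule how strongly or how early the contrastive term contributes during training; the specific schedule is not described in this abstract.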
Submission Number: 16