Contrastive Pre-training for Zero-Shot Information Retrieval

Gautier Izacard; Mathilde Caron; Lucas Hosseini; Sebastian Riedel; Piotr Bojanowski; Armand Joulin; Edouard Grave

Published: 28 Jan 2022, Last Modified: 13 Feb 2023, ICLR 2022 Submitted, Readers: Everyone

Keywords: information retrieval, contrastive pretraining

Abstract: Information retrieval is an important component of natural language processing for knowledge-intensive tasks such as question answering and fact checking. Recently, information retrieval has seen the emergence of dense retrievers, based on neural networks, as an alternative to classical sparse methods based on term frequency. Neural retrievers work well on the problems for which they were specifically trained, but they do not generalize as well as term-frequency methods to new domains or applications. By contrast, in many other NLP tasks, conventional self-supervised pre-training based on masking leads to strong generalization with a small number of training examples. We believe this is not yet the case for information retrieval, because these pre-training methods are not well adapted to this task. In this work, we consider contrastive learning as a more natural pre-training technique for retrieval and show that it leads to models that are competitive with BM25 on many domains or applications, even without training on supervised data. Our dense pre-trained models also compare favorably against BERT pre-trained models in the few-shot setting, and achieve state-of-the-art performance on the BEIR benchmark when fine-tuned on MS-MARCO.

One-sentence Summary: Contrastive pretraining for information retrieval.
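
The abstract names contrastive learning as the pre-training technique but does not state the exact objective. As a minimal sketch, assuming an InfoNCE-style loss with in-batch negatives (a common contrastive objective, not confirmed by this page), the idea can be illustrated in plain Python. The function name `info_nce_loss`, the dot-product scoring, and the `temperature` default are all illustrative assumptions rather than the authors' implementation.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def info_nce_loss(queries, docs, temperature=0.05):
    """Mean InfoNCE loss over a batch of embeddings.

    Assumption: docs[i] is the positive document for queries[i], and every
    other document in the batch serves as a negative for that query.
    """
    if not queries or len(queries) != len(docs):
        raise ValueError("queries and docs must be non-empty and equal in length")

    total = 0.0
    for i, q in enumerate(queries):
        # Similarity of this query to every document in the batch.
        logits = [dot(q, d) / temperature for d in docs]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the positive at index i.
        total += log_z - logits[i]
    return total / len(queries)


# Hypothetical 2-d embeddings: the loss is lower when each query is
# closest to its own positive document.
aligned = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]], 1.0)
swapped = info_nce_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]], 1.0)
```

In this sketch, minimizing the loss pulls each query embedding toward its paired document and away from the other documents in the batch, which is the general mechanism that lets a dense retriever be trained without labeled relevance data.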
