Submission Type: Regular Long Paper
Submission Track: Information Retrieval and Text Mining
Keywords: Retrieval Augmented Language Model, Zero-shot Dense Retrieval, Mixture of Memory
TL;DR: We propose a novel retrieval augmented language model to extract knowledge from diverse memory corpora for zero-shot dense retrieval.
Abstract: In this paper, we improve the zero-shot generalization ability of language models via Mixture-Of-Memory Augmentation (MoMA), a mechanism that retrieves augmentation documents from multiple information corpora (external memories), with the option to "plug in" unseen memory at inference time.
We develop a joint learning mechanism that trains the augmentation component with latent labels derived from the end retrieval task, paired with hard negatives from the memory mixture.
We instantiate the model in a zero-shot dense retrieval setting by augmenting strong T5-based retrievers with MoMA.
With only T5-base, our model obtains strong zero-shot retrieval accuracy on the eighteen tasks included in the standard BEIR benchmark, outperforming some systems with larger model sizes.
As a plug-and-play model, MoMA generalizes efficiently to any unseen corpus while achieving comparable or even better performance than methods that rely on target-specific pretraining.
Our analysis further illustrates the necessity of augmenting with mixture-of-memory for robust generalization, the benefits of augmentation learning, and how MoMA utilizes the plug-in memory at inference time without changing its parameters.
Our code can be found at https://github.com/gesy17/MoMA.
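The mixture-of-memory retrieval described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that); all names, the number of memories, and the additive fusion step are illustrative assumptions. It shows the core idea: a query retrieves top-k candidates from each memory, including a plug-in memory added at inference time, and the pooled results augment the query representation.

```python
import numpy as np

# Illustrative sketch of mixture-of-memory augmentation (not the MoMA
# implementation). Each "memory" is a corpus of precomputed document
# embeddings; the query pulls top-k documents from every memory and the
# pooled embeddings are fused back into the query vector.

rng = np.random.default_rng(0)
DIM = 8

# Three external memories, each an (n_docs, DIM) embedding matrix.
# "plug_in" stands in for an unseen corpus attached at inference time
# without any parameter updates.
memories = {
    "wiki": rng.normal(size=(100, DIM)),
    "marco": rng.normal(size=(80, DIM)),
    "plug_in": rng.normal(size=(50, DIM)),
}

def retrieve_top_k(query_vec, doc_matrix, k=3):
    """Return the top-k document embeddings by dot-product similarity."""
    scores = doc_matrix @ query_vec
    top = np.argsort(-scores)[:k]
    return doc_matrix[top]

def mixture_of_memory_augment(query_vec, memories, k=3):
    """Pool top-k embeddings from every memory and fuse their mean
    into the query (simple additive fusion, for illustration only)."""
    pooled = np.concatenate(
        [retrieve_top_k(query_vec, m, k) for m in memories.values()]
    )
    return query_vec + pooled.mean(axis=0)

q = rng.normal(size=DIM)
q_aug = mixture_of_memory_augment(q, memories)
print(q_aug.shape)  # (8,)
```

The augmented query `q_aug` would then be used for final retrieval against the target corpus; in MoMA the augmentation component itself is trained jointly with latent labels from the end retrieval task.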
Submission Number: 2195