HypeR: Multitask Hyper-Prompted Training Enables Large-Scale Retrieval Generalization

ZeFeng Cai; Chongyang Tao; Tao Shen; Can Xu; Xiubo Geng; Xin Alex Lin; Liang He; Daxin Jiang

HypeR: Multitask Hyper-Prompted Training Enables Large-Scale Retrieval Generalization

ZeFeng Cai, Chongyang Tao, Tao Shen, Can Xu, Xiubo Geng, Xin Alex Lin, Liang He, Daxin Jiang

Published: 01 Feb 2023, Last Modified: 01 Mar 2023ICLR 2023 posterReaders: Everyone

Keywords: Uniformed Large-Scale Retrieval, Multi-Task hyper-prompted training, Retrieval Generalization

Abstract: Recently, large-scale text retrieval has made impressive progress, facilitating both information retrieval and downstream knowledge-intensive tasks (e.g., open-domain QA and dialogue). With a moderate amount of data, a neural text retriever can outperform traditional methods such as BM25 by a large step. However, while being applied to out-of-domain data, the performance of a neural retriever degrades considerably. Therefore, how to enable a retriever to perform more robustly across different domains or tasks and even show strong zero-shot transfer ability is critical for building scalable IR systems. To this end, we propose HypeR, a hyper-prompted training mechanism to enable uniform retrieval across tasks of different domains. Specifically, our approach jointly trains the query encoder with a shared prompt-based parameter pool and a prompt synthesizer that dynamically composes hyper-prompt for encoding each query from different tasks or domains. Besides, to avoid the mode collapse of prompt attention distribution for different queries, we design a contrastive prompt regularization that promotes the mode of prompt attention to be aligned and uniform. Through multi-task hyper-prompted training, our retriever can master the ability to dynamically represent different types of queries and transfer knowledge across different domains and tasks. Extensive experiments show our model attains better retrieval performance across different tasks and better zero-shot transfer ability compared with various previous methods.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Submission Guidelines: Yes

Please Choose The Closest Area That Your Submission Falls Into: Applications (eg, speech processing, computer vision, NLP)

TL;DR: A multitask hyper-prompted training mechanism that enables a neural retriever to dynamically process different types of queries with different hyper-prompts and transfer learned knowledge across different domains and tasks.

13 Replies

Loading