UPRISE: Universal Prompt Retrieval for Improving Zero-Shot Evaluation

Daixuan Cheng; Shaohan Huang; Junyu Bi; Yuefeng Zhan; Jianfeng Liu; Yujing Wang; Hao Sun; Furu Wei; Weiwei Deng; Qi Zhang

Published: 07 Oct 2023, Last Modified: 01 Dec 2023, Venue: EMNLP 2023 Main

Submission Type: Regular Long Paper

Submission Track: Theme Track: Large Language Models and the Future of NLP

Submission Track 2: Information Retrieval and Text Mining

Keywords: Prompt Engineering, Large Language Model, Zero-shot Evaluation

TL;DR: Universal prompt retrieval for improving zero-shot task performance of different LLMs

Abstract: Large Language Models (LLMs) are popular for their impressive abilities, but the need for model-specific fine-tuning or task-specific prompt engineering can hinder their generalization. We propose UPRISE (Universal Prompt Retrieval for Improving zero-Shot Evaluation), which tunes a lightweight and versatile retriever that automatically retrieves prompts for a given zero-shot task input. Specifically, we demonstrate universality in a cross-task and cross-model scenario: the retriever is tuned on diverse tasks, but tested on unseen task types; we use a small frozen LLM, GPT-Neo-2.7B, for tuning the retriever, but test the retriever on different LLMs of much larger scales, such as BLOOM-7.1B, OPT-66B and GPT3-175B. Additionally, we show that UPRISE mitigates the hallucination problem in our experiments with ChatGPT, suggesting its potential to improve even the strongest LLMs. Our model and code are available at https://github.com/microsoft/LMOps.
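To make the retrieve-then-prompt idea in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes a toy bag-of-words cosine similarity in place of the tuned neural retriever, and the names `embed`, `retrieve_prompts`, `build_llm_input` and the example prompt pool are all hypothetical. It only illustrates the flow: score each prompt in a pool against the task input, keep the top k, and prepend them to the input that would be sent to a frozen LLM.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: stands in for a tuned encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_prompts(task_input, prompt_pool, k=2):
    """Return the k prompts from the pool most similar to the task input."""
    query = embed(task_input)
    ranked = sorted(
        prompt_pool,
        key=lambda prompt: cosine(query, embed(prompt)),
        reverse=True,
    )
    return ranked[:k]


def build_llm_input(task_input, prompt_pool, k=2):
    """Prepend the retrieved prompts to the task input for a frozen LLM."""
    prompts = retrieve_prompts(task_input, prompt_pool, k)
    return "\n\n".join(prompts + [task_input])


# Hypothetical prompt pool and zero-shot task input, for illustration only.
pool = [
    "Review: the movie was great. Sentiment: positive",
    "Question: what is the capital of France? Answer: Paris",
    "Review: the food was awful. Sentiment: negative",
]
task = "Review: the movie was awful. Sentiment:"
llm_input = build_llm_input(task, pool, k=2)
```

In this sketch the LLM is never touched; only the retrieval step changes per input, which mirrors the abstract's point that the same retriever can be paired with different frozen LLMs.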

Submission Number: 462
