Keywords: prompting, pre-trained language model
Abstract: The continual scaling of pre-trained language models (PLMs) imposes a substantial burden on model adaptation, necessitating more efficient alternatives to conventional fine-tuning. Given the advantage of prompting in the zero-shot setting and the observed performance fluctuation among different prompts, we explore instance-level prompts and their achievable upper-bound performance. We first validate the assumption that for almost every instance there is a lottery prompt that induces the correct prediction from the given PLM without tuning a single parameter. Meanwhile, we find that some strong prompts perform well over the entire training set. We then attempt to generalize these strong prompts from the training set to the test set with ensembling methods. Experiments conducted on various types of NLP classification tasks demonstrate that the proposed method outperforms strong optimization-based baselines by at least 3% (and up to 17%) in average metric.
Paper Type: long
Research Area: Interpretability and Analysis of Models for NLP