PromptTrans: Eliciting In-Context Learning from Smaller Language Models by Translating Demonstrations to Prompts
Keywords: in-context learning, meta-training, prompt tuning, representation learning
TL;DR: elicit in-context learning from smaller LMs by translating demonstrations to prompts
Abstract: Large language models (LMs) like GPT-3 have shown remarkable in-context learning ability: by concatenating demonstration examples as the input context, the model is able to infer on an unseen task without further training. For smaller LMs like T5-large, however, in-context learning performance is abysmally poor. This raises a question about the design of the in-context learning framework: is there a better way to condition on demonstrations than simply concatenating them as text? Towards this, we propose PromptTrans, a parameter-efficient tuning framework (3.4% of backbone LM parameters) that translates demonstrations into soft prompts and augments the input context with the translated soft prompts. We meta-train PromptTrans on 120 tasks and evaluate on 40 unseen tasks from the CrossFit dataset, with few-shot (<= 16) demonstrations per task. Through our experiments, we show that PromptTrans is indeed instrumental in eliciting in-context learning ability in smaller LMs (T5-large), without updating any parameters of the backbone. A particularly interesting finding is that, across our extensive experiments, PromptTrans consistently outperforms baselines that meta-train the whole backbone LM for in-context learning, and even large off-the-shelf LMs with 16.8x as many parameters. Our promising results and analysis shed light on how even smaller LMs can learn in context and point towards a more effective in-context learning paradigm.
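The abstract does not spell out the architecture, but the core idea (encode the few-shot demonstrations, map them to a small set of soft prompt vectors, and prepend those vectors to the frozen backbone's input embeddings instead of concatenating demonstration text) can be sketched roughly as follows. This is a minimal, hedged PyTorch illustration, not the paper's implementation; names such as `DemonstrationTranslator`, `num_prompt_tokens`, and `forward_with_soft_prompts` are hypothetical, and the pooling/projection design is an assumption.

```python
# Minimal sketch of translating demonstrations into soft prompts for a frozen T5-large.
# All module and parameter names below are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import T5ForConditionalGeneration, T5Tokenizer


class DemonstrationTranslator(nn.Module):
    """Maps pooled demonstration embeddings to a fixed number of soft prompt vectors."""

    def __init__(self, d_model: int, num_prompt_tokens: int = 20):
        super().__init__()
        self.num_prompt_tokens = num_prompt_tokens
        # Small trainable head; only these parameters are updated.
        self.proj = nn.Sequential(
            nn.Linear(d_model, d_model),
            nn.ReLU(),
            nn.Linear(d_model, num_prompt_tokens * d_model),
        )

    def forward(self, demo_embeds: torch.Tensor) -> torch.Tensor:
        # demo_embeds: (num_demos, seq_len, d_model) token embeddings of the demonstrations.
        pooled = demo_embeds.mean(dim=(0, 1))               # (d_model,)
        prompts = self.proj(pooled)                         # (k * d_model,)
        return prompts.view(self.num_prompt_tokens, -1)     # (k, d_model)


tokenizer = T5Tokenizer.from_pretrained("t5-large")
backbone = T5ForConditionalGeneration.from_pretrained("t5-large")
for p in backbone.parameters():          # the backbone stays frozen
    p.requires_grad = False

translator = DemonstrationTranslator(backbone.config.d_model)


def forward_with_soft_prompts(demos, query, target):
    # Embed the few-shot demonstrations with the frozen backbone's embedding table.
    demo_ids = tokenizer(demos, return_tensors="pt", padding=True).input_ids
    demo_embeds = backbone.get_input_embeddings()(demo_ids)     # (n, L, d)
    soft_prompts = translator(demo_embeds).unsqueeze(0)         # (1, k, d)

    # Prepend the translated soft prompts to the embedded query
    # instead of concatenating the demonstration text.
    query_ids = tokenizer(query, return_tensors="pt").input_ids
    query_embeds = backbone.get_input_embeddings()(query_ids)   # (1, Lq, d)
    inputs_embeds = torch.cat([soft_prompts, query_embeds], dim=1)

    labels = tokenizer(target, return_tensors="pt").input_ids
    return backbone(inputs_embeds=inputs_embeds, labels=labels).loss
```

In a meta-training setup of the kind the abstract describes, only `translator` would be optimized across the training tasks, and at evaluation time the same forward pass conditions the frozen backbone on an unseen task's demonstrations without any further parameter updates.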