Abstract: Prompt tuning is one of the successful approaches for parameter-efficient tuning of pre-trained language models. Despite being arguably the most parameter-efficient (tuned soft prompts constitute $<0.1\%$ of total parameters), it typically performs worse than other efficient tuning methods and is quite sensitive to hyper-parameters. In this work, we introduce \textsc{Residual Prompt Tuning} -- a simple and efficient method that significantly improves the performance and stability of prompt tuning. We propose to reparameterize soft prompt embeddings using a shallow network with a residual connection. Our experiments show that \textsc{Residual Prompt Tuning} significantly outperforms prompt tuning on the SuperGLUE benchmark across T5-Large, T5-Base, and BERT-Base models. Notably, our method achieves a $+7$-point improvement over prompt tuning with T5-Base and allows the prompt length to be reduced by a factor of 10 without hurting performance. In addition, we show that our approach is robust to the choice of learning rate and prompt initialization, and is effective in few-shot settings.
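As a rough illustration of the reparameterization described above, the core idea could be sketched in PyTorch as follows. This is a minimal sketch, not the authors' implementation: the bottleneck width, the use of LayerNorm, and the initialization scale are assumptions not specified in the abstract.

```python
import torch
import torch.nn as nn


class ResidualReparamPrompt(nn.Module):
    """Soft prompt whose embeddings are passed through a shallow MLP
    with a skip (residual) connection before being prepended to the
    frozen model's input embeddings. Hypothetical sketch only."""

    def __init__(self, prompt_len: int, embed_dim: int, bottleneck_dim: int = 256):
        super().__init__()
        # Trainable soft prompt embeddings (assumed small random init).
        self.prompt = nn.Parameter(torch.randn(prompt_len, embed_dim) * 0.02)
        # Shallow reparameterization network: down-project, nonlinearity, up-project.
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, bottleneck_dim),
            nn.ReLU(),
            nn.Linear(bottleneck_dim, embed_dim),
        )
        # LayerNorm placement is an assumption for this sketch.
        self.norm = nn.LayerNorm(embed_dim)

    def forward(self) -> torch.Tensor:
        # Residual connection: reparameterized prompt = f(P) + P.
        return self.norm(self.mlp(self.prompt)) + self.prompt


# Usage sketch: the reparameterized prompt is prepended to the token
# embeddings of a frozen language model; only the prompt and the shallow
# MLP receive gradients.
prompt_module = ResidualReparamPrompt(prompt_len=10, embed_dim=768)
reparam_prompt = prompt_module()  # shape: (10, 768)
```

In this sketch, the residual connection lets the network fall back to the identity mapping over the raw prompt embeddings, which is one plausible reading of how the reparameterization stabilizes optimization.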