Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners

Ningyu Zhang; Luoqiu Li; Xiang Chen; Shumin Deng; Zhen Bi; Chuanqi Tan; Fei Huang; Huajun Chen

Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners

Ningyu Zhang, Luoqiu Li, Xiang Chen, Shumin Deng, Zhen Bi, Chuanqi Tan, Fei Huang, Huajun Chen

Published: 28 Jan 2022, Last Modified: 22 Jun 2025ICLR 2022 PosterReaders: Everyone

Keywords: prompt-tuning, pre-trained language model, few-shot learning

Abstract: Large-scale pre-trained language models have contributed significantly to natural language processing by demonstrating remarkable abilities as few-shot learners. However, their effectiveness depends mainly on scaling the model parameters and prompt design, hindering their implementation in most real-world applications. This study proposes a novel pluggable, extensible, and efficient approach named DifferentiAble pRompT (DART), which can convert small language models into better few-shot learners. The main principle behind this approach involves reformulating potential natural language processing tasks into the task of a pre-trained language model and differentially optimizing the prompt template as well as the target label with backpropagation. Furthermore, the proposed approach can be: (i) Plugged to any pre-trained language models; (ii) Extended to widespread classification tasks. A comprehensive evaluation of standard NLP tasks demonstrates that the proposed approach achieves a better few-shot performance.

One-sentence Summary: A differentiable prompt learning method for few-shot NLP with optimized prompt templates as well as labels.

Supplementary Material: zip

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 2 code implementations](https://www.catalyzex.com/paper/differentiable-prompt-makes-pre-trained/code)

13 Replies

Loading