Abstract: In natural language processing (NLP), prompt-based learning is widely used for parameter-efficient learning. However, this approach shortens the usable input length by the length of the attached prompt, leading to inefficient use of the input space. In this study, we propose P-Distill, a novel prompt compression method that mitigates this limitation of prompt-based learning while maintaining performance via knowledge distillation. The knowledge distillation process of P-Distill consists of two components: prompt initialization and prompt distillation. Experiments on various NLP tasks demonstrate that P-Distill achieves comparable or superior performance to other state-of-the-art prompt-based learning methods, even with significantly shorter prompts. Specifically, we achieve a peak improvement of 1.90% even when prompt lengths are compressed to one-eighth. An additional study further provides insights into the distinct impact of each component on the overall performance of P-Distill. Our code will be released upon acceptance.
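The abstract does not specify the exact objectives, so the following is only a minimal sketch of the general idea it describes: compressing a long teacher soft prompt into a much shorter student prompt. The pooling-based initialization, the KL-divergence matching of output distributions, and all names (`ToyBackbone`, `PromptedModel`) are illustrative assumptions, not the paper's actual formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyBackbone(nn.Module):
    """Stand-in for a frozen pretrained encoder plus classification head."""
    def __init__(self, hidden: int, num_classes: int):
        super().__init__()
        self.encoder = nn.TransformerEncoderLayer(hidden, nhead=4, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, x):                      # x: (batch, seq, hidden)
        return self.head(self.encoder(x).mean(dim=1))

class PromptedModel(nn.Module):
    """Frozen backbone with a trainable soft prompt prepended to the input embeddings."""
    def __init__(self, backbone: nn.Module, prompt_len: int, hidden: int):
        super().__init__()
        self.backbone = backbone
        self.prompt = nn.Parameter(torch.randn(prompt_len, hidden) * 0.02)

    def forward(self, x):
        p = self.prompt.unsqueeze(0).expand(x.size(0), -1, -1)
        return self.backbone(torch.cat([p, x], dim=1))

hidden, num_classes = 64, 3
backbone = ToyBackbone(hidden, num_classes)
for param in backbone.parameters():
    param.requires_grad_(False)                # only soft prompts are trained

# Teacher uses a long prompt; student uses a prompt 1/8 the length.
# (In practice the teacher's long prompt would first be trained on the task.)
teacher = PromptedModel(backbone, prompt_len=64, hidden=hidden)
student = PromptedModel(backbone, prompt_len=8, hidden=hidden)

# Prompt initialization (assumed form): seed the short prompt by average-pooling
# the teacher prompt down to the student's length.
with torch.no_grad():
    pooled = F.adaptive_avg_pool1d(teacher.prompt.T.unsqueeze(0), 8)
    student.prompt.copy_(pooled.squeeze(0).T)

# Prompt distillation (assumed form): train the short prompt so the student's
# output distribution matches the teacher's on the same inputs.
optimizer = torch.optim.Adam([student.prompt], lr=1e-3)
x = torch.randn(16, 32, hidden)                # dummy batch of input embeddings
with torch.no_grad():
    teacher_logits = teacher(x)

optimizer.zero_grad()
loss = F.kl_div(F.log_softmax(student(x), dim=-1),
                F.softmax(teacher_logits, dim=-1),
                reduction="batchmean")
loss.backward()
optimizer.step()
```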
Paper Type: Long
Research Area: Special Theme (conference specific)
Research Area Keywords: distillation, parameter-efficient-training
Contribution Types: NLP engineering experiment, Approaches low compute settings-efficiency
Languages Studied: English
Submission Number: 2539