An Information-theoretic Approach to Prompt Engineering Without Ground Truth Labels

24 Apr 2023 · OpenReview Archive Direct Upload
Abstract: Pre-trained language models derive substantial linguistic and factual knowledge from the massive corpora on which they are trained, and prompt engineering seeks to align these models to specific tasks. Unfortunately, existing prompt engineering methods require significant amounts of labeled data, access to model parameters, or both. We introduce a new method for selecting prompt templates without labeled examples and without direct access to the model. Specifically, over a set of candidate templates, we choose the template that maximizes the mutual information between the input and the corresponding model output. Across 8 datasets representing 7 distinct NLP tasks, we show that when a template has high mutual information, it also has high accuracy on the task. On the largest model, selecting prompts with our method gets 90% of the way from the average prompt accuracy to the best prompt accuracy and requires no ground truth labels.
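The selection rule described in the abstract reduces to an argmax over a mutual information estimate. The sketch below illustrates one way this could look in Python, assuming a hypothetical helper `answer_dist(template, x)` that returns the model's probability distribution over candidate answers for input `x` (for example, read off next-token probabilities from an API-served model); the decomposition I(X; Y) = H(Y) - H(Y|X) is a standard estimator from per-input output distributions and is an assumption here, not a verbatim transcription of the paper's estimator.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy (in nats) of a probability vector."""
    p = np.clip(p, eps, 1.0)
    return -np.sum(p * np.log(p))

def mutual_information(output_dists):
    """Estimate I(X; Y) from per-input output distributions.

    output_dists: array of shape (n_inputs, n_classes), where row i is
    the model's distribution over answers for input i under a fixed
    template. Uses I(X; Y) = H(Y) - H(Y|X): H(Y) is the entropy of the
    marginal (mean) output distribution, H(Y|X) the mean per-input entropy.
    """
    output_dists = np.asarray(output_dists, dtype=float)
    marginal = output_dists.mean(axis=0)
    h_conditional = np.mean([entropy(row) for row in output_dists])
    return entropy(marginal) - h_conditional

def select_template(templates, inputs, answer_dist):
    """Return the candidate template with maximal estimated MI.

    answer_dist(template, x) is a hypothetical callable returning the
    model's answer distribution for input x; no labels are used.
    """
    scores = {t: mutual_information([answer_dist(t, x) for x in inputs])
              for t in templates}
    return max(scores, key=scores.get), scores
```

Note that this procedure needs only the model's output probabilities on unlabeled inputs, which is consistent with the abstract's claim of requiring neither ground truth labels nor access to model parameters.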