Abstract: Pre-trained language models derive substantial linguistic and factual knowledge from the massive corpora on which they are trained, and prompt engineering seeks to align these models to specific tasks. Unfortunately, existing prompt engineering methods require significant amounts of labeled data, access to model parameters, or both. We introduce a new method for selecting prompt templates without labeled examples and without direct access to the model. Specifically, over a set of candidate templates, we choose the template that maximizes the mutual information between the input and the corresponding model output. Across 8 datasets representing 7 distinct NLP tasks, we show that when a template has high mutual information, it also has high accuracy on the task. On the largest model, selecting prompts with our method gets 90% of the way from the average prompt accuracy to the best prompt accuracy and requires no ground truth labels.
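To make the selection criterion concrete, here is a minimal sketch of template scoring by mutual information. It assumes MI is estimated as I(X; Y) = H(Y) - H(Y | X) from the model's output probability distributions over a fixed set of answer options, with H(Y) taken from the marginal (input-averaged) distribution and H(Y | X) as the mean per-example entropy; the exact estimator and any calibration details are those of the paper, and the function names and array shapes below are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def entropy(p, eps=1e-12):
    # Shannon entropy (in nats) of a probability vector, or row-wise for a matrix.
    p = np.clip(p, eps, 1.0)
    return -np.sum(p * np.log(p), axis=-1)

def mutual_information(probs):
    # probs: (n_examples, n_answer_options) array; each row is the model's
    # normalized output distribution over answer options for one unlabeled input.
    # Estimate I(X; Y) = H(Y) - H(Y | X); no ground-truth labels are used.
    marginal = probs.mean(axis=0)          # p(y), averaged over inputs
    h_y = entropy(marginal)                # H(Y)
    h_y_given_x = entropy(probs).mean()    # H(Y | X), mean per-example entropy
    return h_y - h_y_given_x

def select_template(template_probs):
    # template_probs: list of (n_examples, n_answer_options) arrays, one per
    # candidate template, obtained by querying the model as a black box.
    # Returns the index of the highest-MI template and all scores.
    scores = [mutual_information(p) for p in template_probs]
    return int(np.argmax(scores)), scores
```

In this sketch, only forward queries to the model are needed (no parameters, no labels): each candidate template is scored from its output distributions alone, and the highest-scoring template is selected.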