Abstract: Pre-trained language models derive substantial linguistic and factual knowledge from the massive corpora on which they are trained, and prompt engineering seeks to align these models to specific tasks. Unfortunately, existing prompt engineering methods require significant amounts of labeled data, access to model parameters, or both. We introduce a new method for selecting prompt templates without labeled examples and without direct access to the model. Specifically, over a set of candidate templates, we choose the template that maximizes the mutual information between the input and the corresponding model output. Across 8 datasets representing 7 distinct NLP tasks, we show that when a template has high mutual information, it also has high accuracy on the task. On the largest model, selecting prompts with our method gets 90% of the way from the average prompt accuracy to the best prompt accuracy and requires no ground truth labels.
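To make the selection criterion concrete, here is a minimal sketch of template scoring by mutual information. It assumes MI is estimated as I(X; Y) = H(Y) - H(Y | X) from the model's output probability distributions over a fixed set of answer options, with H(Y) taken from the marginal (input-averaged) distribution and H(Y | X) as the mean per-example entropy; the exact estimator and any calibration details are those of the paper, and the function names and array shapes below are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def entropy(p, eps=1e-12):
    # Shannon entropy (in nats) of a probability vector, or row-wise for a matrix.
    p = np.clip(p, eps, 1.0)
    return -np.sum(p * np.log(p), axis=-1)

def mutual_information(probs):
    # probs: (n_examples, n_answer_options) array; each row is the model's
    # normalized output distribution over answer options for one unlabeled input.
    # Estimate I(X; Y) = H(Y) - H(Y | X); no ground-truth labels are used.
    marginal = probs.mean(axis=0)          # p(y), averaged over inputs
    h_y = entropy(marginal)                # H(Y)
    h_y_given_x = entropy(probs).mean()    # H(Y | X), mean per-example entropy
    return h_y - h_y_given_x

def select_template(template_probs):
    # template_probs: list of (n_examples, n_answer_options) arrays, one per
    # candidate template, obtained by querying the model as a black box.
    # Returns the index of the highest-MI template and all scores.
    scores = [mutual_information(p) for p in template_probs]
    return int(np.argmax(scores)), scores
```

In this sketch, only forward queries to the model are needed (no parameters, no labels): each candidate template is scored from its output distributions alone, and the highest-scoring template is selected.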