Abstract: Researchers often rely on humans to code (label, annotate, etc.) large sets of texts. Manual coding is time- and resource-intensive, and its results can vary widely across coders. Efforts to automate this process have achieved human-level accuracy in some cases, but they often rely on thousands of hand-labeled training examples, which makes them inapplicable to small-scale research studies and still costly for large ones. At the same time, it is well known that language models can classify text; in this work, we use GPT-3 as a synthetic coder and compare it to human coders using classic methodologies and metrics, such as intercoder reliability. We find that GPT-3 can match the performance of typical human coders and frequently outperforms them in terms of intercoder agreement across a variety of social science tasks, suggesting that language models could serve as useful coders.
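The workflow the abstract describes can be made concrete with a minimal sketch (not the authors' code): prompt GPT-3 to assign one label per text, then measure agreement with a human coder using a standard intercoder-reliability statistic such as Cohen's kappa. The model name, prompt wording, and label set below are illustrative assumptions; this uses the legacy (pre-1.0) OpenAI Python SDK and scikit-learn.

```python
# Illustrative sketch only: GPT-3 as a synthetic coder, scored against
# a human coder with Cohen's kappa. Model, prompt, and labels are
# hypothetical choices, not the paper's exact setup.
import openai  # legacy (<1.0) OpenAI SDK; assumes OPENAI_API_KEY is set
from sklearn.metrics import cohen_kappa_score

LABELS = ["positive", "negative", "neutral"]  # hypothetical coding scheme

def gpt3_code(text: str) -> str:
    """Ask GPT-3 for a single label; temperature 0 for reproducibility."""
    prompt = (
        "Label the sentiment of the following text as one of "
        f"{', '.join(LABELS)}.\n\nText: {text}\nLabel:"
    )
    resp = openai.Completion.create(
        model="text-davinci-003",  # assumed GPT-3 variant
        prompt=prompt,
        max_tokens=3,
        temperature=0,
    )
    return resp.choices[0].text.strip().lower()

texts = ["I loved it.", "Terrible service.", "It arrived on Tuesday."]
human_labels = ["positive", "negative", "neutral"]  # one human coder
gpt_labels = [gpt3_code(t) for t in texts]

# Intercoder agreement between the synthetic coder and the human coder;
# the same statistic can be computed between pairs of human coders for
# comparison against the model.
print("Cohen's kappa:", cohen_kappa_score(human_labels, gpt_labels))
```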
Paper Type: long