Abstract: Obtaining and annotating data can be expensive and time-consuming, especially in complex, low-resource domains. By comparing data synthetically augmented via Llama-2 and GPT-4 with human-labeled data, we explore the impact of training data size on ten computational social science classification tasks of varying complexity. We find that models trained on human-labeled data often perform comparably to or better than their synthetically augmented counterparts, although synthetic augmentation is particularly helpful for rare classes in multi-class tasks. We also use GPT-4 and Llama-2 for zero-shot classification and find that, despite their generally strong performance, they are often comparable to or even inferior to specialized classifiers trained on modest-sized training sets.
Paper Type: short
Research Area: NLP Applications
Contribution Types: NLP engineering experiment
Languages Studied: English, Danish