Abstract: The use of propagandistic techniques in online content has increased in recent years, aiming to manipulate online audiences. Although identifying such techniques is essential for more informed content consumption, very limited attention has been given to the task of extracting the textual spans where propaganda techniques are used. Our study focuses on this task, investigating whether large language models (LLMs), such as GPT-4, can effectively extract these spans. We further study the potential of employing the model to collect annotations more cost-effectively. Our experiments use a large-scale, manually annotated in-house dataset. The results suggest that providing the model with more annotation context in the prompt improves its performance compared to human annotations. Moreover, our work is the first to show the potential of using LLMs to develop annotated datasets for this complex task by prompting them with annotations from human annotators with limited expertise. All annotations will be shared with the community.
Paper Type: short
Research Area: Resources and Evaluation
Contribution Types: Approaches to low-resource settings, Data resources
Languages Studied: Arabic