On the Importance of Nuanced Taxonomies for LLM-Based Understanding of Harmful Events: A Case Study on Antisemitism

ACL ARR 2024 June Submission 1490 Authors

14 Jun 2024 (modified: 07 Aug 2024) · ACL ARR 2024 June Submission · CC BY 4.0
Abstract: Monitoring the news at scale for incidents of hate, violence, and other toxicity is essential to understanding broad societal trends, including harms to marginalized communities. As large language models (LLMs) become a primary tool for understanding events at scale, they can be useful for elucidating these harms. However, labeling harmful events is challenging due to the subjectivity of labels such as "toxicity" and "hate." Motivated by the rise of antisemitism, this paper presents a case study of the capability of LLMs to discover reports of antisemitic events. We pilot the task of hateful event classification on the AMCHA Corpus, a continuously updated dataset with expert-labeled instances of 3 coarse-grained categories and 14 fine-grained types of antisemitism. We show that incorporating domain knowledge from fine-grained taxonomies is needed to make LLMs more effective at this task. Our experiments find that providing precise definitions from a fine-grained taxonomy of antisemitism can steer GPT-4 and Llama-3 toward modestly better performance on tagging antisemitic event descriptions, with GPT-4 achieving up to a 14% increase in mean weighted F1. However, our results indicate that LLMs are still far from reliable at understanding antisemitic events, pointing to avenues for future work on more creative LLM alignment and on policy efforts to create precise definitions of antisemitism.
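As a rough illustration of the prompting setup the abstract describes (definitions from a fine-grained taxonomy included in the classification prompt), the sketch below uses the OpenAI chat API with a small set of hypothetical category names and paraphrased definitions. It is not the authors' released code; the model name, category names, and definition texts are illustrative placeholders, and the real AMCHA taxonomy has 3 coarse-grained categories and 14 fine-grained types.

```python
# Minimal sketch, assuming an OpenAI-compatible chat endpoint and hypothetical
# taxonomy entries; the actual prompts and labels come from the paper/corpus.
from openai import OpenAI

# Hypothetical subset of fine-grained categories with paraphrased definitions.
TAXONOMY = {
    "targeting_individuals": "Harassment, threats, or exclusion directed at "
                             "people because of their Jewish identity.",
    "antisemitic_expression": "Speech, imagery, or symbols invoking classic "
                              "antisemitic tropes or Holocaust denial.",
    "vandalism": "Damage to or defacement of property with antisemitic "
                 "intent, such as graffiti on Jewish institutions.",
}

def classify_event(description: str, model: str = "gpt-4") -> str:
    """Tag one event description with a single taxonomy label."""
    definitions = "\n".join(f"- {name}: {text}" for name, text in TAXONOMY.items())
    prompt = (
        "You are labeling reports of antisemitic events.\n"
        f"Category definitions:\n{definitions}\n\n"
        f"Event description: {description}\n"
        "Answer with exactly one category name from the list above."
    )
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content.strip()

if __name__ == "__main__":
    print(classify_event("Flyers with antisemitic caricatures were posted on campus."))
```

Comparing predictions made with and without the definition block, scored with weighted F1 over the fine-grained labels (e.g., scikit-learn's f1_score with average="weighted"), would mirror the evaluation the abstract reports.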
Paper Type: Long
Research Area: Computational Social Science and Cultural Analytics
Research Area Keywords: antisemitism, toxicity detection, hateful event understanding, large language models
Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models, Data resources, Data analysis
Languages Studied: English
Submission Number: 1490