On the Importance of Nuanced Taxonomies for LLM-Based Understanding of Harmful Events: A Case Study on Antisemitism

Published: 06 Oct 2024, Last Modified: 12 Nov 2024, WiNLP 2024, CC BY 4.0
Keywords: antisemitism, toxicity detection, hateful event understanding, large language models
TL;DR: A stress test of LLMs' ability to perform fine-grained antisemitic event classification.
Abstract: Large language models (LLMs) can help elucidate hate, violence, and other forms of toxicity. However, labeling harmful events is challenging because labels such as "toxicity" and "hate" are subjective. Motivated by the rise of antisemitism, this paper studies the capability of LLMs to discover reports of antisemitic events. We pilot the task of hateful event classification on the AMCHA Corpus, a continuously updated dataset with expert-labeled instances of fine-grained types of antisemitism, and show that incorporating domain knowledge from fine-grained taxonomies is needed to make LLMs more effective. Our experiments find that providing precise definitions from a taxonomy steers GPT-4 and Llama-3 toward modest improvements in tagging antisemitic event descriptions, with GPT-4 achieving up to a 14% increase in mean weighted F1. However, LLMs are still far from perfect at understanding antisemitic events, suggesting avenues for future work on LLM alignment and on precise definitions of antisemitism.
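
To illustrate the kind of taxonomy-augmented prompting the abstract describes, here is a minimal sketch in Python. The category names and definitions below are illustrative placeholders, not the AMCHA Corpus taxonomy, and the prompt wording and model calls are assumptions (using the OpenAI chat API); the paper's actual prompts and label set may differ.

```python
# Sketch: steering an LLM with precise taxonomy definitions to tag
# antisemitic event descriptions. Definitions here are hypothetical
# stand-ins for a fine-grained taxonomy, not the AMCHA labels.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical fine-grained categories with precise definitions.
TAXONOMY = {
    "vandalism": "Physical damage or defacement of property targeting "
                 "Jews or Jewish institutions.",
    "harassment": "Verbal or written abuse directed at individuals "
                  "because they are, or are perceived to be, Jewish.",
    "genocidal_expression": "Speech calling for or celebrating violence "
                            "against Jews as a group.",
}

def classify_event(description: str) -> str:
    """Ask the model to tag one event description with one taxonomy label."""
    definitions = "\n".join(f"- {name}: {defn}" for name, defn in TAXONOMY.items())
    prompt = (
        "You are annotating reports of antisemitic events.\n"
        "Use these category definitions:\n"
        f"{definitions}\n\n"
        f"Event description: {description}\n"
        "Answer with exactly one category name from the list."
    )
    response = client.chat.completions.create(
        model="gpt-4",
        temperature=0,  # deterministic tagging for evaluation
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content.strip()
```

Predictions from such a loop could then be scored against expert labels with scikit-learn's `f1_score(y_true, y_pred, average="weighted")`, matching the mean weighted F1 metric reported in the abstract.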
Submission Number: 58