Abstract: This paper explores the ability of large language models (LLMs) to detect hate speech in low-resource settings, focusing on the Russian language. It specifically evaluates how well models such as GPT-3.5 Turbo (hereafter GPT-3.5) and LLaMA 2 can classify hate speech against LGBTQ+ individuals and Ukrainian war refugees. Zero-shot, few-shot, and fine-tuning methods are applied to assess model performance in non-English contexts. To address the lack of labelled hate speech data, high-quality data sets were sourced mainly from Russian social media. While the LLMs achieve some success, they struggle because English dominates their training data. Fine-tuning and instruction-based methods show promise for improving classification accuracy. The study highlights the need for specialized data and training to boost performance in under-represented languages.
Paper Type: Short
Research Area: Efficient/Low-Resource Methods for NLP
Research Area Keywords: NLP, hate speech detection, transformer models, data processing
Contribution Types: Approaches to low-resource settings, Data resources, Data analysis
Languages Studied: Russian
Submission Number: 1537