Abstract: This paper explores the ability of large language models (LLMs) to detect hate speech in low-resource settings, focusing on the Russian language. It specifically evaluates how well models such as GPT-3.5 Turbo (hereafter GPT-3.5) and LLaMA 2 can classify hate speech against LGBTQ+ individuals and Ukrainian war refugees. Zero-shot, few-shot, and fine-tuning methods are applied to assess model performance in non-English contexts. To address the lack of labelled hate speech data, high-quality data sets were sourced mainly from Russian social media. While the LLMs achieve some success, they struggle because English dominates their training data. Fine-tuning and instruction-based methods show promise for improving classification accuracy. The study highlights the need for specialized data and training to boost performance in under-represented languages.
Paper Type: Short
Research Area: Efficient/Low-Resource Methods for NLP
Research Area Keywords: NLP, hate speech detection, transformer models, data processing
Contribution Types: Approaches to low-resource settings, Data resources, Data analysis
Languages Studied: Russian
Submission Number: 1537