Keywords: Natural Language Processing, African languages, Offensive language detection
TL;DR: Offensive language detection in the three major Nigerian languages
Abstract: The proliferation of online offensive language necessitates the development of effective detection mechanisms, especially in multilingual contexts. This study addresses the challenge by introducing novel datasets for hate speech detection in three major Nigerian languages: Hausa, Yoruba, and Igbo. We collected data from Twitter, and native speakers manually annotated it to create a dataset for each of the three languages. We then evaluated pre-trained language models on these datasets to assess their efficacy in detecting offensive language. The best-performing model achieved an accuracy of 90%. To further support research in offensive language detection, we plan to make the datasets and our model publicly available.
Submission Number: 51