Abstract: Hate speech detection in low-resource languages remains a significant challenge due to the scarcity of annotated datasets. We introduce NepX-Hate, a new benchmark dataset for hate speech detection in low-resource languages, centered on Nepali with an auxiliary Hindi subset for cross-lingual experiments. The dataset comprises 10,000 annotated tweets labeled across multiple dimensions: hate speech presence, fine-grained category (e.g., casteism, xenophobia), offensiveness, target type, and sentiment. NepX-Hate is the first publicly available hate-speech dataset with multi-aspect sociocultural annotations, covering general social media discourse beyond prior domain-specific efforts. We provide benchmarks across traditional classifiers and multilingual transformer models, revealing challenges in detecting implicit hate and highlighting how fine-grained labels aid model interpretability. NepX-Hate provides a comprehensive testbed for hate speech research in underrepresented languages, enabling both sociocultural analysis and multilingual transfer. We release the dataset and code publicly, aiming to support robust, explainable hate speech detection in the Global South.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: hate-speech detection, corpus creation, benchmarking, language resources, multilingual corpora, datasets for low resource languages
Contribution Types: Model analysis & interpretability, Data resources, Data analysis
Languages Studied: Nepali, Hindi
Submission Number: 8036