AWARE-Hate: Joint Sense–Subjectivity for Word-Level Hate Speech Detection

ACL ARR 2026 January Submission4155 Authors

05 Jan 2026 (modified: 20 Mar 2026), ACL ARR 2026 January Submission, Readers: Everyone, License: CC BY 4.0
Keywords: word-level hate speech detection, word-in-context, lexical sense modeling, annotator subjectivity, contrastive learning, reinforcement learning
Abstract: Word-level hate speech detection requires modeling both contextual meaning and annotator perspectives, yet current methods often overlook definitional sense and annotator subjectivity. We propose AWARE-Hate, a framework that integrates dictionary definitions and annotator profiles. Training proceeds in two stages: supervised learning first establishes classification capability, and reinforcement-learning-based alignment then refines the predictions. In our experiments, AWARE-Hate outperforms fine-tuned LLMs, and ablations indicate that jointly modeling lexical sense and annotator subjectivity improves detection.
Paper Type: Long
Research Area: Computational Social Science, Cultural Analytics, and NLP for Social Good
Research Area Keywords: hate-speech detection, language/cultural bias analysis, sociolinguistics, NLP tools for social analysis
Contribution Types: Model analysis & interpretability, NLP engineering experiment
Languages Studied: English
Submission Number: 4155