AWARE-Hate: Joint Sense–Subjectivity for Word-Level Hate Speech Detection

ACL ARR 2026 January Submission4155 Authors

05 Jan 2026 (modified: 20 Mar 2026) | ACL ARR 2026 January Submission | Readers: Everyone | License: CC BY 4.0

Keywords: word-level hate speech detection, word-in-context, lexical sense modeling, annotator subjectivity, contrastive learning, reinforcement learning

Abstract: Word-level hate speech detection requires modeling both contextual meaning and annotator perspectives, yet current methods often overlook definitional sense and annotator subjectivity. We propose AWARE-Hate, a framework that integrates dictionary definitions and annotator profiles. Training proceeds in two stages: supervised learning first establishes classification capability, and RL-based alignment then refines the predictions. Experiments show that AWARE-Hate outperforms fine-tuned LLMs, and ablations verify that jointly modeling lexical sense and annotator subjectivity improves detection.

Paper Type: Long

Research Area: Computational Social Science, Cultural Analytics, and NLP for Social Good

Research Area Keywords: hate-speech detection, language/cultural bias analysis, sociolinguistics, NLP tools for social analysis

Contribution Types: Model analysis & interpretability, NLP engineering experiment

Languages Studied: English

Submission Number: 4155
