DeHate: A Holistic Hateful Video Dataset for Explicit and Implicit Hate Detection
Abstract: Hate speech poses a persistent threat to society, causing profound harm to both individuals and communities. Detecting such content is essential for promoting safer and more inclusive environments. While previous research has primarily focused on text-based or image-based hate speech detection, video-based hate detection remains relatively underexplored. A key barrier is the limited availability of high-quality video datasets. Existing hateful video datasets are typically limited in scale, diversity, and annotation depth, often labeling content as hateful without further distinguishing between explicit and implicit forms. In this work, we present DeHate\footnote{\url{https://github.com/yuchen-zhang-essex/DeHate}}, which, to the best of our knowledge, is the largest hateful video dataset to date. DeHate comprises 6,689 videos collected from two platforms and spanning six social groups. Each video is annotated with fine-grained labels that differentiate explicit, implicit, and non-hateful content, along with segment-level localization of hateful content, identification of the contributing modalities, and specification of the targeted groups. Through a detailed analysis of the annotated videos across platforms, we reveal distinct patterns in how hateful content is conveyed, offering a comprehensive comparison of explicit and implicit hate in terms of their prevalence and characteristics. Furthermore, we benchmark state-of-the-art models, including both uni-modal and multi-modal architectures, and identify persistent challenges in detecting subtle, context-dependent forms of hate. Our findings highlight the importance of holistic, fine-grained hateful video datasets for advancing research in hate speech detection.
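To make the annotation dimensions described above concrete, the following is a minimal sketch of what a per-video annotation record could look like. All class, field, and value names here are illustrative assumptions, not the dataset's actual schema; they simply mirror the dimensions the abstract names (a three-way explicit/implicit/non-hateful label, segment-level localization, contributing modalities, and targeted groups).

\begin{verbatim}
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class HateLabel(Enum):
    """Top-level video label, following the abstract's three-way distinction."""
    NON_HATEFUL = "non_hateful"
    EXPLICIT = "explicit"
    IMPLICIT = "implicit"


@dataclass
class HateSegment:
    """One localized hateful segment within a video (hypothetical fields)."""
    start_sec: float        # segment start time, in seconds
    end_sec: float          # segment end time, in seconds
    modalities: List[str]   # contributing modalities, e.g. ["audio", "visual"]
    target_groups: List[str]  # targeted social groups, e.g. ["religion"]


@dataclass
class VideoAnnotation:
    """Per-video record combining the annotation dimensions named above."""
    video_id: str
    platform: str           # one of the two source platforms
    label: HateLabel        # explicit / implicit / non-hateful
    # Localized segments; presumably empty for non-hateful videos.
    segments: List[HateSegment] = field(default_factory=list)
\end{verbatim}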