On the Identification and Forecasting of Hate Speech in Inceldom

Anonymous

16 Dec 2022 (modified: 05 May 2023), ACL ARR 2022 December Blind Submission, Readers: Everyone
Abstract: Spotting hate speech in social media posts is crucial to increasing the civility of the Web and has been thoroughly explored in the NLP community. We introduce the first multilingual corpus for the analysis and identification of hate speech in the domain of inceldom. It is built from incel Web forums in English and Italian and includes expert annotation at the post level for two kinds of hate speech: misogyny and racism. This resource paves the way for the development of mono- and multilingual models for (a) the identification of hateful posts (binary and multi-label settings) and (b) the forecasting of the amount of hateful responses that a post is likely to trigger (regression setting). Our models reach an F$_1$ score above 0.85 in the classification settings and a mean absolute error (MAE) of around 0.10 in the forecasting settings. These results show that it is feasible to approximate the extent of hate speech that a full thread is likely to contain as soon as the first post has been made public, be it in English or Italian.
Paper Type: short
Research Area: Resources and Evaluation