Abstract: In this article, we address the task of hostile post detection in Hindi. The objective is to predict whether a social media post is hostile or not. Furthermore, if the post is hostile, we identify one or more fine-grained hostile dimensions out of the following four: fake, hate, offensive, and defamation. We propose HostileNet, a novel deep-learning framework that leverages HindiBERT-based contextual representations and hand-crafted features such as lexicon, emoticon, and hashtag embeddings for hostile post classification. We also propose a novel mechanism to fine-tune HindiBERT's attention vectors with respect to each hostile dimension. We evaluate HostileNet on the CONSTRAINT-2021 shared task dataset on hostile post detection in Hindi in both the coarse-grained (hostile versus nonhostile) and fine-grained (fake versus hate versus offensive versus defamation) setups. HostileNet outperforms the best-performing system reported in the CONSTRAINT-2021 shared task in both setups. Furthermore, we provide a thorough analysis of the obtained results, including an ablation study, error analysis, attention heatmap analysis, and lexicon feature analysis. We also perform an in-the-wild evaluation and conduct a user survey to assess the robustness of our proposed model. We make the code and the curated multilabel hostile lexicon available for research use at https://github.com/LCS2-IIITD/HostileNet.