Aggression Detection on Multilingual Social Media Text

Shukrity Si, Anisha Datta, Somnath Banerjee, Sudip Kumar Naskar

Published: 2019, Last Modified: 04 May 2025, ICCCNT 2019, CC BY-SA 4.0

Abstract: With the advancement of technology, social media platforms such as Facebook and Twitter play an important role in communication, whether through texting, sharing photos, audio-video calls, or expressing views through comments. Along with these advantages, social media also has a negative side: it can carry aggression towards some sections of people. Such aggression and hatred on social media need to be detected and prevented automatically, which is the main objective of our work. We worked on Hindi, English and Hindi-English (code-mixed) datasets. We used features such as word vectors, aggressive words (from a manually created dictionary), sentiment scores, parts of speech and emojis for the classification task. We experimented with several machine learning and deep learning models, and the results indicate that the XGBoost classifier, the Gradient Boosting classifier (GBM) and the Support Vector Machine (SVM) are best suited to the task. The outputs of these three classifiers were therefore combined by majority voting, which yields F-scores of 68.13, 54.82 and 55.31 for the English, Hindi and code-mixed datasets, respectively.
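
The majority-voting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `majority_vote`, the label names ("overt", "covert", "none") and the tie-break rule (fall back to the first classifier when all votes differ) are assumptions made for illustration, and the prediction lists stand in for the outputs of the XGBoost, GBM and SVM classifiers.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-classifier label predictions by hard majority voting.

    predictions[i][j] is the label that classifier i assigns to sample j.
    When no single label has the most votes (e.g. all three classifiers
    disagree), the first classifier's label is used. This tie-break rule
    is an assumption, not something stated in the abstract.
    """
    if not predictions:
        return []
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must predict the same number of samples")

    combined = []
    for votes in zip(*predictions):  # votes for one sample, one per classifier
        counts = Counter(votes)
        top_label, top_count = counts.most_common(1)[0]
        # Collect every label that reaches the highest vote count.
        tied = [label for label, c in counts.items() if c == top_count]
        combined.append(top_label if len(tied) == 1 else votes[0])
    return combined


# Hypothetical predictions for three samples from the three classifiers.
xgb_pred = ["overt", "none", "covert"]
gbm_pred = ["overt", "covert", "none"]
svm_pred = ["none", "covert", "overt"]

print(majority_vote([xgb_pred, gbm_pred, svm_pred]))
# prints ['overt', 'covert', 'covert']
```

In the third sample all three classifiers disagree, so the sketch falls back to the first classifier's label; a real system might instead break ties using classifier confidence or validation performance.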