Online Threats Detection in Hausa Language

Published: 03 Mar 2023, Last Modified: 15 Apr 2023, AfricaNLP 2023, Readers: Everyone
Keywords: Low Resource Language, Hausa Language, Natural Language Processing, Twitter Dataset
Abstract: The Internet is among the most widely used technological inventions, and it has enabled online social media platforms such as Twitter and Facebook to proliferate. These platforms are instrumental as a means of socialisation and information exchange among diverse users. The use of online social media to spread information can be both beneficial and harmful. On the positive side, the information can be useful in areas such as security, the economy and climate change. Motivated by the growing number of online users and the widespread availability of content with the potential to cause harm, this study examines how online content with threatening themes is expressed in the Hausa language. We collect the first Hausa dataset of threatening content from Twitter and develop a classification system to help curtail security risks by informing decisions on tackling insecurity and related challenges. We train four machine learning algorithms, Random Forest (RF), XGBoost, Decision Tree (DT) and Naive Bayes, to classify the annotated dataset. The classification results show an accuracy of 72% for XGBoost, 71% for RF, 67% for DT and 57% for Naive Bayes, the lowest of the four.
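To make the classification step concrete, the following is a minimal sketch of one of the four algorithms named above, Naive Bayes, applied to short texts. It is not the authors' implementation: the abstract does not state the features or preprocessing used, so whitespace tokenisation, bag-of-words counts and add-one smoothing are assumptions here, and the class and function names are hypothetical. The toy sentences are English placeholders, not samples from the Hausa dataset.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase + whitespace split; the paper's preprocessing is not stated.
    return text.lower().split()


class NaiveBayesText:
    """Multinomial Naive Bayes over bag-of-words counts (illustrative sketch)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict_one(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus log likelihood of each known token (add-one smoothing).
            score = math.log(n_docs / total_docs)
            n_words = sum(self.word_counts[label].values())
            for tok in tokenize(text):
                if tok in self.vocab:  # tokens unseen in training are skipped
                    count = self.word_counts[label][tok]
                    score += math.log((count + 1) / (n_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

    def predict(self, texts):
        return [self.predict_one(t) for t in texts]


def accuracy(y_true, y_pred):
    # Fraction of predictions that match the annotated labels.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical placeholder data (English, invented for illustration only).
train_texts = [
    "i will hurt you",
    "you will regret this",
    "see you at the market",
    "the weather is nice today",
]
train_labels = ["threat", "threat", "non-threat", "non-threat"]

model = NaiveBayesText().fit(train_texts, train_labels)
test_texts = ["i will hurt them", "nice weather at the market"]
test_labels = ["threat", "non-threat"]
predictions = model.predict(test_texts)
print(predictions, accuracy(test_labels, predictions))
```

The accuracy figures reported in the abstract would come from the same kind of comparison between predicted and annotated labels on held-out tweets, with each of the four classifiers evaluated in turn.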