Keywords: Arabic, COVID-19, dataset, tweets, NLP
TL;DR: This paper presents AROT-COV23 (ARabic Original Tweets on COVID-19 as of 2023), a dataset of original Arabic tweets about COVID-19.
Abstract: This paper presents AROT-COV23 (ARabic Original Tweets on COVID-19 as of 2023), a dataset of about 500,000 original Arabic COVID-19-related tweets posted between January 2020 and January 2023. We analyze the dataset with a corpus-based approach to identify common themes and trends and to gain insight into how Arabic Twitter users have discussed the pandemic. We present the results of this analysis and discuss their implications for Natural Language Processing (NLP) in Africa and for understanding the role of Twitter in the spread of COVID-19-related information in the region.