Applying Latent Dirichlet Allocation Technique to Classify Topics on Sustainability Using Arabic Text
Abstract: In this paper, we build on the existing literature on topic modelling and sustainability by exploring Arabic text, mapping the Sustainable Development Goals (SDGs) set out by the United Nations to tweets published in Arabic. The work utilized the popular Latent Dirichlet Allocation (LDA) technique to summarize and present subtopics that matter to various sustainability areas, with a focus on 3 of the 17 SDGs. A TF-IDF term-weighting scheme and a document-term matrix were used to highlight the most influential keywords that formed the topics. The work presented a unique set of topics and terms that correlate with particular areas of sustainability. Further exploration of Arabic sources will inform people concerned with sustainability about the various issues related to sustainable development in the Arab world. The work presented in this paper is a step towards formalizing a framework that will capture and analyze various aspects of unstructured data revolving around sustainability.
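To make the term-weighting step concrete, the following is a minimal sketch of building a document-term matrix and ranking keywords by TF-IDF. It is not the authors' pipeline: the toy documents, the names `docs`, `dtm`, `tfidf_row`, and `top_terms`, and the specific weighting variant (relative term frequency multiplied by ln(N/df)) are all assumptions made for illustration, and the LDA step itself is not shown.

```python
import math
from collections import Counter

# Hypothetical, pre-tokenized toy "tweets" (placeholders, not data from the paper).
docs = [
    ["water", "clean", "water", "access"],
    ["energy", "solar", "clean"],
    ["water", "energy", "climate"],
]

# Document-term matrix: one row per document, one column per vocabulary term.
vocab = sorted({term for doc in docs for term in doc})
dtm = [[Counter(doc)[term] for term in vocab] for doc in docs]

# Document frequency: number of documents in which each term appears.
n_docs = len(docs)
df = [sum(1 for row in dtm if row[j] > 0) for j in range(len(vocab))]


def tfidf_row(row):
    """Weight one row of the matrix (assumed variant: tf = count / doc length, idf = ln(N / df))."""
    total = sum(row)
    return [(count / total) * math.log(n_docs / df[j]) for j, count in enumerate(row)]


tfidf = [tfidf_row(row) for row in dtm]


def top_terms(doc_index, k=2):
    """Return the k highest-weighted terms for one document."""
    scored = sorted(zip(vocab, tfidf[doc_index]), key=lambda pair: pair[1], reverse=True)
    return [term for term, _ in scored[:k]]
```

Calling `top_terms(i)` for each document lists the terms that are frequent in that document but rare across the collection, which is the sense in which TF-IDF highlights influential keywords. Real Arabic text would additionally need tokenization and normalization before this step.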
External IDs: dblp:conf/sai/QudahHSCM22