The Current State of the NLP in Sub-Saharan Africa - A Position Paper

ACL ARR 2024 June Submission4044 Authors

16 Jun 2024 (modified: 02 Jul 2024) | ACL ARR 2024 June Submission | CC BY 4.0
Abstract: In this study, we present our position on the current state of NLP in Sub-Saharan Africa. Our position draws on literature surveys we conducted of NLP activities (research and publications) on Sub-Saharan African languages. Based on the survey results, we discuss issues affecting NLP research outcomes for Sub-Saharan Africa. These issues include low-quality results of NLP studies, insufficient access to research funding, and the lack of an interdisciplinary approach to NLP research on the region's languages. The survey shows that most NLP work on Sub-Saharan African languages from 2020 to 2023 centered on corpus development, language modeling, and sentiment analysis. About 61% of NLP work in Sub-Saharan Africa has no access to funding. Funding comes mainly from NGOs, and 66.7% of the work that received funding consists of multilingual studies. However, 64% of NLP activities in the region are monolingual. We conclude by recommending ways to address the issues raised and discovered through the survey.
Paper Type: Long
Research Area: Ethics, Bias, and Fairness
Research Area Keywords: low-resources, funding NLP research, sub-Saharan African languages, NLP applications, NLP datasets
Contribution Types: Position papers
Languages Studied: Sub-Saharan African Languages; Hausa, Kiswahili, Amharic, Bantu, Zulu, others
Submission Number: 4044