INJONGO: A Multicultural Intent Detection and Slot-filling Dataset for 16 African Languages

ACL ARR 2025 February Submission 753 Authors

11 Feb 2025 (modified: 09 May 2025), ACL ARR 2025 February Submission, CC BY 4.0
Abstract: Slot-filling and intent detection are well-established tasks in Conversational AI. However, current large-scale benchmarks for these tasks often exclude low-resource languages and rely on translations of English benchmarks, and therefore predominantly reflect Western-centric concepts. In this paper, we introduce "INJONGO", a multicultural, open-source benchmark dataset for 16 African languages with utterances generated by native speakers across diverse domains, including banking, travel, home, and dining. Through extensive experiments, we benchmark fine-tuned multilingual transformer models and prompted large language models (LLMs), and show the advantage of leveraging African-cultural utterances over Western-centric utterances for improving cross-lingual transfer from English. Experimental results reveal that current LLMs struggle with the slot-filling task, with GPT-4o achieving an average F1 of 26. In contrast, intent detection performance is notably better, with an average accuracy of 70.6%, though it still falls short of fine-tuning baselines. For English, GPT-4o and fine-tuning baselines perform similarly on intent detection, achieving an accuracy of approximately 81%. Our findings suggest that LLM performance still lags for many low-resource African languages, and that more work is needed to improve their downstream performance.
Paper Type: Long
Research Area: Multilingualism and Cross-Lingual NLP
Research Area Keywords: Information Extraction, Dialogue and Interactive Systems, Multilingualism and Cross-Lingual NLP, Resources and Evaluation
Contribution Types: Publicly available software and/or pre-trained models, Data resources, Data analysis
Languages Studied: Amharic, Ewe, Hausa, Igbo, Kinyarwanda, Lingala, Luganda, Oromo, Shona, Sesotho, Swahili, Twi, Wolof, Xhosa, Yoruba, Zulu, English
Submission Number: 753