BambaraMLLM: A Unified Multilingual Multimodal Large Language Model for Comprehensive Bambara Language Processing

Published: 27 Jan 2026, Last Modified: 17 Feb 2026, AfricaNLP 2026, CC BY 4.0
Abstract: BambaraMLLM is a unified multilingual multimodal large language model (MMLLM) designed to address the critical lack of digital resources for Bambara, a West African language spoken by over 15 million people. Unlike traditional approaches that rely on separate task-specific models for different linguistic functions, BambaraMLLM integrates text generation, automatic speech recognition (ASR), machine translation (MT), and text-to-speech (TTS) synthesis into a single transformer-based architecture. This work establishes a scalable, open-source foundation for African language technology, optimizing for both performance and deployment under resource constraints.
Submission Number: 53