Griots Interviews, Bambara Language WAV, 30 hours, Recorded 2022, Cultural and ASR Training Resource

Sebastien Diarra, Michael Leventhal

Published: 16 Aug 2022, Last Modified: 07 Jan 2026, Zenodo, CC BY-SA 4.0
Abstract: Source material for this project: an addition to a 200,000-line clean, synchronized Bambara-French corpus. In a joint project with Google, 30 hours of video interviews with Griots were recorded. All 30 hours were manually transcribed and translated into French, and 10 hours were used to train an ASR system and an MT transformer. The material is 100% open source. A cultural and technical exhibition is to be hosted online and in the National Museum of Mali. The project aims to record, preserve, and share Malian culture with the world; contribute to the science of low-resource language NLP; reinforce the development of written Bambara; and enable Bambara to reach status as a “first-class internet language”. The corresponding transcribed data can be found in the accompanying GitHub repository.