TCNSpeech: A Community-Curated Speech Corpus for Sermons

Wuraola Fisayo Oyewusi; Sharon Ibejih; Soromfe Uzomah; Elizabeth Mawutin Joseph; Jon Cynthia; Folakunmi Ojemuyiwa; Benedicta Johnson-Onuigwe; Omolola Taiwo; Akintunde Akinpelumi; Olabisi Adesina; Ayodele Noutouglo; Adeola Adeleke Adeoba; Andrew Akoh; Chukwuemeka Nwachukwu; Opeyemi Agbabiaje; Itunu Falade; Olukemi Erhunmwunsee; Oluwatobiloba Dada; Olúwatóbi David OSIBELUWO; Ehis Akene; Udim Akpan; Moira Amadi-Emina; Jaiyeola Marquis; Michael Senapon Bojerenu; Gbolahan Olumade; Oluwagbemi Lesi; Timothy Ezeh; Oluwadamilola Oguntoyinbo; Tosan Mogbeyiteren; Felicia Oresanya; Samuel Chika; Sodiq Akinjobi

Published: 08 Apr 2022, Last Modified: 05 May 2023 | AfricaNLP 2022 | Readers: Everyone

Keywords: ASR, Sermon, Speech To Text, Nigerian, TCNSpeech

TL;DR: A dataset of sermons curated by a community for Automatic Speech Recognition tasks

Abstract: In this work, we present TCNSpeech, a community-curated, multi-speaker sermon corpus for speech recognition tasks. It contains a total of 24 hours of English audio recordings, chunked and transcribed. The dataset is domain-specific, covering sermons delivered in Nigerian-accented English, and it also serves as a use case for community data curation. The dataset will be made publicly available.
