Accessing, managing, and mobilizing an ELAN-based language documentation corpus: the Kwaras and Namuti tools

08 Dec 2023, OpenReview Archive Direct Upload, Readers: Everyone
Abstract: This paper introduces Kwaras and Namuti, two new tools for building, managing, accessing, and mobilizing ELAN-based language documentation corpora. Kwaras integrates WAV files, ELAN annotations, and document metadata into a web-based corpus, allowing immediate access to annotations and recordings. Namuti builds from Kwaras and enables different uses of language documentation products for different audiences and provides links from linguistic analyses to language documentation corpora. The main goal of these new tools is threefold: (i) to facilitate the use of language documentation in linguistic analysis; (ii) to increase transparency of documentation-based analyses, providing interested users full access to the data on which generalizations are based and contextualization of the projects that generated the data; and (iii) to enable uses of language corpora that may serve the interests of multiple stakeholders, including academic researchers and community members interested in language maintenance and revitalization. We provide a basic overview of how Kwaras and Namuti work, lay out instructions on how to download and use Kwaras, and discuss what uses it currently supports. This article also issues a call for increased collaboration between linguists, community members, language activists, and software developers to further develop these and other similar resources.