TL;DR: We train models to geolocate speech audio and show how such models can be applied to language identification.
Abstract: In this paper, we explore training models to answer the question "Where are you from?" at a global scale. In other words, we train models to geolocate speech based on language, accent, and dialect. By leveraging radio broadcasts with known geographic locations, we train interpretable models for geolocation from audio and demonstrate that solving this task also provides a simple but novel method for language identification (LID). We show that our method can outperform standard self-supervised models.
Paper Type: long
Research Area: Speech recognition, text-to-speech and spoken language understanding
Contribution Types: NLP engineering experiment, Approaches to low-resource settings
Languages Studied: US English, Latin American Spanish, Brazilian Portuguese, French, Polish, Macedonian, Russian, Malayalam, Hong Kong Yue, Filipino, and Japanese