Identifying language from songs

Himadri Mukherjee, Ankita Dhar, Sk Md Obaidullah, KC Santosh, Santanu Phadikar, Kaushik Roy

Published: 2021, Last Modified: 07 Nov 2024. Multim. Tools Appl. 2021. CC BY-SA 4.0

Abstract: Audio signal-based applications have evolved significantly over the last decade, from speech recognizers to audio-based search engines, and healthcare is no exception. The same holds when multimedia content needs to be analyzed. One of the most popular and fastest-growing sources of multimedia is music, which can be in either audio or video format. Retrieving data efficiently from such ever-increasing information demands different indexing and categorization techniques. Automated song search engines can benefit greatly from a language identifier that segregates songs by the language used. In this paper, we propose to identify the language of songs using Line Spectral Frequency-Approximation Gradation (LSF-AG) features and an ensemble learning-based classification technique. Ensemble learning was used for its better generalization ability. In experiments on more than 70 hours of data in three languages (English, Bangla, and Hindi), we achieved a highest average accuracy of 98.61%, which outperforms standard techniques. Further, the robustness of the system was tested on noisy datasets.
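
The abstract does not say which ensemble method was used, so the following is only a minimal sketch of the general idea behind ensemble classification, not the authors' method. It assumes bagging (bootstrap resampling) with majority voting over simple nearest-centroid classifiers. The two-dimensional feature vectors are synthetic stand-ins, not LSF-AG features, and all function names are hypothetical.

```python
import random
from collections import Counter


def centroid(rows):
    """Mean vector of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def train_nearest_centroid(samples):
    """samples: list of (feature_vector, label). Returns {label: centroid}."""
    by_label = {}
    for x, y in samples:
        by_label.setdefault(y, []).append(x)
    return {y: centroid(rows) for y, rows in by_label.items()}


def predict_one(model, x):
    """Label whose centroid is closest to x (squared Euclidean distance)."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(model, key=lambda y: sqdist(model[y]))


def train_bagged_ensemble(samples, n_models=15, seed=0):
    """Train each base classifier on a bootstrap resample of the data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        boot = [rng.choice(samples) for _ in samples]
        models.append(train_nearest_centroid(boot))
    return models


def predict_ensemble(models, x):
    """Majority vote over the base classifiers' predictions."""
    votes = Counter(predict_one(m, x) for m in models)
    return votes.most_common(1)[0][0]


def make_toy_data(seed=1, per_class=40):
    """Synthetic 2-D clusters standing in for per-song feature vectors."""
    rng = random.Random(seed)
    # Assumed cluster centres, chosen only so the classes are separable.
    centers = {"English": (0.0, 0.0), "Bangla": (5.0, 0.0), "Hindi": (0.0, 5.0)}
    data = []
    for label, (cx, cy) in centers.items():
        for _ in range(per_class):
            data.append(([rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)], label))
    return data


if __name__ == "__main__":
    ensemble = train_bagged_ensemble(make_toy_data())
    print(predict_ensemble(ensemble, [4.9, 0.3]))
```

Because each base classifier sees a different resample, their individual errors tend to differ, and the vote averages them out; this is the usual intuition behind the better generalization that the abstract attributes to ensemble learning.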