Learning Audio Features for Singer Identification and Embedding
Nov 03, 2017 (modified: Nov 03, 2017) | ICLR 2018 Conference Blind Submission | readers: everyone
Abstract: Neural networks are increasingly used for music information retrieval tasks. In this paper, we empirically investigate ways of improving the performance of convolutional neural networks (CNNs) on spectral audio features. Specifically, we explore three aspects of CNN design: network depth, the use of residual blocks together with grouped convolution, and global aggregation over time. The application context is singer classification and singing performance embedding, and we believe the conclusions extend to other types of music analysis using convolutional neural networks. The results show that, of the three, global time aggregation improves CNN performance the most. Another contribution of this paper is the release of a dataset of singing recordings that can be used for training and evaluation.
TL;DR: Using deep learning techniques on tasks related to the singing voice.
Keywords: convolutional neural networks, attention, music information retrieval
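
To make "global aggregation over time" concrete, here is a minimal NumPy sketch of what such a step might look like. It is not the paper's implementation: the function name `global_time_aggregate`, the `(time, channels)` layout, the three modes, and the scoring vector `attn_vector` are all assumptions made for illustration. The idea is only that a variable-length sequence of CNN feature frames is collapsed into one fixed-size vector per recording.

```python
import numpy as np


def global_time_aggregate(features, mode="mean", attn_vector=None):
    """Collapse the time axis of a (time, channels) feature map into one vector.

    Hypothetical helper for illustration; the layout and modes are assumptions.
    """
    features = np.asarray(features, dtype=float)
    if features.ndim != 2:
        raise ValueError("expected a 2-D array shaped (time, channels)")

    if mode == "mean":
        # Average every channel over all time frames.
        return features.mean(axis=0)
    if mode == "max":
        # Keep the strongest activation of every channel over time.
        return features.max(axis=0)
    if mode == "attention":
        if attn_vector is None:
            raise ValueError("attention mode needs attn_vector of shape (channels,)")
        # One scalar score per frame, turned into softmax weights over time.
        scores = features @ np.asarray(attn_vector, dtype=float)
        scores = scores - scores.max()  # numerical stability
        weights = np.exp(scores) / np.exp(scores).sum()
        # Weighted sum of frames; weights sum to 1.
        return weights @ features
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    # Two toy "recordings" of different lengths, 4 channels each.
    rng = np.random.default_rng(0)
    short_clip = rng.normal(size=(50, 4))
    long_clip = rng.normal(size=(300, 4))
    for clip in (short_clip, long_clip):
        print(global_time_aggregate(clip, mode="mean").shape)
```

Whatever the clip length, the output has one value per channel, which is what lets a classifier or embedding layer sit on top of recordings of different durations. In a trained model the attention scoring vector would be learned; here it is simply passed in.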