Abstract: Data-driven methods for automatic singing quality assessment have so far focused on producing an overall assessment score for a given singing rendition. However, explaining such a score in terms of musically relevant components of singing quality, such as intonation accuracy and rhythm correctness, has not been attempted due to the lack of annotated training data. In this work, we propose to augment a singing vocals dataset that contains only professional singing renditions with negative samples, improving the diversity of singing quality examples in the training data. We validate this augmented dataset through listening tests. Moreover, we use this data to formulate a multi-task learning framework that simultaneously provides pitch accuracy feedback and an overall singing quality score for a given singing rendition. We show that our methods outperform existing systems on both unseen songs and singers, for English and Mandarin popular songs.