Improving Automatic Singing Skill Evaluation with Timbral Features, Attention, and Singing Voice Separation

Yaolong Ju, Chunyang Xu, Yichen Guo, Jinhu Li, Simon Lui

Published: 2023, Last Modified: 23 Mar 2026. Venue: ICME 2023. License: CC BY-SA 4.0
Abstract: Most automatic singing skill evaluation (ASSE) models focus only on solo singing, which limits their scope of application, since singing in music is usually mixed with instrumental accompaniment. In this paper, we propose a more general ASSE model that applies to both solo singing and singing with accompaniment. For this purpose, we employ an existing singing voice separation tool for accompaniment removal and compare ASSE models trained with and without accompaniment. Results show that accompaniment removal yields better performance. Furthermore, we explore different features and model architectures, concluding that the addition of timbral features, an attention mechanism, and a dense layer further improves performance. Finally, we show that our proposed model achieves a Pearson correlation coefficient of 0.562, a 62.4% relative improvement over the baseline model's 0.346.
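The reported 62.4% figure can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' evaluation code: the function names `pearson` and `relative_improvement` are hypothetical, and the assumption is that the correlation is computed between predicted and reference skill scores in the standard way.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Unnormalized standard deviations; the 1/n factors cancel in the ratio.
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


def relative_improvement(new, baseline):
    """Relative gain of `new` over `baseline`, as a fraction."""
    return (new - baseline) / baseline


# Values taken from the abstract: proposed model 0.562, baseline 0.346.
print(f"{relative_improvement(0.562, 0.346):.1%}")  # prints 62.4%
```

In other words, (0.562 - 0.346) / 0.346 ≈ 0.624, which matches the relative improvement stated in the abstract.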