Vocal separation using extended robust principal component analysis with Schatten p/lp-norm and scale compression

Il-Young Jeong, Kyogu Lee

Published: 2014, Last Modified: 28 Apr 2023, Venue: MLSP 2014

Abstract: Separating vocal and accompaniment signals from a monaural music signal is a challenging task. Recently, robust principal component analysis (RPCA) has been applied to the magnitude spectrogram to decompose it into a low-rank matrix and a sparse residual matrix, which are assumed to represent the accompaniment and vocal signals, respectively. In this paper, we propose two extensions of the RPCA algorithm for more effective vocal separation. First, we generalize the conventional RPCA by using the Schatten p-norm and the lp-norm in the spectrogram decomposition framework; these are generalized versions of the nuclear norm and the l1-norm used in RPCA, respectively. Second, we apply suitable scale compression to the magnitude spectrogram, making it a more appropriate representation for the decomposition. Experiments using the MIR-1K dataset show that the proposed methods yield significantly better separation performance than the conventional RPCA.
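
The quantities named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the power-law form of the scale compression, and the weighted-sum objective with a trade-off weight `lam` are all assumptions made for illustration, since the abstract does not specify them. The sketch only evaluates the penalty terms and does not solve the decomposition.

```python
import numpy as np

def schatten_p(X, p):
    """Sum of singular values raised to p (Schatten p-norm to the power p).
    With p = 1 this equals the nuclear norm used in conventional RPCA."""
    s = np.linalg.svd(X, compute_uv=False)
    return float(np.sum(s ** p))

def lp_elementwise(X, p):
    """Sum of absolute entries raised to p (lp-norm to the power p).
    With p = 1 this equals the l1-norm used in conventional RPCA."""
    return float(np.sum(np.abs(X) ** p))

def compress(mag, gamma):
    """Hypothetical scale compression: element-wise power law on a
    non-negative magnitude spectrogram (assumed form, 0 < gamma <= 1)."""
    return np.power(mag, gamma)

def objective(L, S, p, lam):
    """Assumed RPCA-style cost: low-rank penalty on L (accompaniment)
    plus a weighted sparsity penalty on S (vocals)."""
    return schatten_p(L, p) + lam * lp_elementwise(S, p)
```

Setting `p = 1` recovers the nuclear norm and l1-norm penalties of conventional RPCA, while `p < 1` gives the generalized penalties the abstract refers to.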
