Abstract: While nonnegative matrix factorization (NMF) has been applied successfully to gain-robust multi-pitch detection, no method for tracking pitch values over time has been provided. We embed NMF-based pitch detection into a recently proposed pitch-tracking system based on a factorial hidden Markov model (FHMM). The original system models speech spectra with Gaussian mixture models, which makes it sensitive to a gain mismatch between training and test data. We therefore combine the advantages of the two approaches and derive a gain-adaptive observation model for the FHMM. As the training algorithm, we use a modification of ℓ⁰-sparse NMF, which represents the short-time spectrum with scalable basis vectors. Experiments show that the new approach significantly increases the gain robustness of the original tracking system.
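The idea of scalable basis vectors can be illustrated with a minimal sketch. It is not the training algorithm or the observation model of the paper: it assumes the simplest possible case, in which a spectrum is explained by a single nonnegatively scaled basis vector, and it uses a random hypothetical basis `W` and a hypothetical helper `best_scaled_basis`. Under these assumptions, changing the gain of the input changes only the fitted scale, not which basis vector is selected.

```python
import numpy as np


def best_scaled_basis(v, W):
    """Pick the single column of W that best fits v after nonnegative scaling.

    Hypothetical helper for illustration only; returns (column index, gain).
    """
    # Least-squares optimal nonnegative gain for each column w_k:
    # a_k = max(0, <w_k, v> / <w_k, w_k>)
    gains = np.maximum(0.0, (W.T @ v) / np.sum(W * W, axis=0))
    # Squared reconstruction error of v by each scaled column
    residuals = np.sum((v[:, None] - W * gains[None, :]) ** 2, axis=0)
    k = int(np.argmin(residuals))
    return k, float(gains[k])


rng = np.random.default_rng(0)
W = rng.random((64, 5))                    # assumed: 64 frequency bins, 5 basis vectors
v = 0.5 * W[:, 3] + 0.01 * rng.random(64)  # spectrum built from column 3 plus small noise

k_ref, a_ref = best_scaled_basis(v, W)
k_loud, a_loud = best_scaled_basis(10.0 * v, W)  # same spectrum with 10x gain

print(k_ref == k_loud)                     # selected basis vector is unchanged
print(np.isclose(a_loud, 10.0 * a_ref))    # only the fitted gain scales
```

A model that compares spectra to fixed, unscaled templates would score the two inputs differently, which is the kind of gain mismatch the abstract refers to.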