Nonlinear Denoising, Linear Demixing

Published: 18 Oct 2021, Last Modified: 05 May 2023
Venue: ICBINB@NeurIPS2021 Poster
Readers: Everyone
Keywords: nonlinear systems, denoising, combinatorial problem, superposition, non-negative decomposition
TL;DR: Polyphonic piano transcription as a two-stage process, with nonlinear denoising followed by linear demixing.
Abstract: We cast the combinatorial problem of polyphonic piano transcription as a two-stage process. A nonlinear denoising stage maps spectrogram representations of arbitrary piano music with unknown timbral characteristics onto a canonical spectrogram representation with known timbral characteristics. A subsequent linear demixing stage aims to exploit the knowledge of these canonical timbral characteristics. The idea behind the two-stage process is to sidestep the musical bias inherent in the training dataset, which a single-stage, large-capacity nonlinear (neural) transcription system easily picks up. The two-stage process is designed not to force the nonlinear system to solve a combinatorial problem; that problem is more amenable to a linear decomposition method, which has the superposition property. Using the simplest setup we could think of, we obtain rather mixed (pun intended) results on a standard polyphonic piano transcription dataset: the two-stage process still suffers from generalization problems after the first stage, for which the second stage is unable to compensate.
Category: Negative result: I would like to share my insights and negative results on this topic with the community
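To make the linear demixing idea concrete, the following is a minimal sketch of a non-negative decomposition of one spectrogram frame against a fixed dictionary of note templates. It is not the method used in the paper: the toy harmonic templates, the function names `harmonic_template` and `demix`, the bin layout, and the choice of multiplicative updates for a least-squares objective are all assumptions made for illustration. It only shows the superposition property that the abstract relies on: a frame built as a weighted sum of two known templates is explained by activations for those two notes.

```python
def harmonic_template(f0_bin, n_bins=24, n_harmonics=4):
    """Toy 'canonical' note spectrum (an assumption, not the paper's):
    partials at integer multiples of f0_bin with amplitude 1/m."""
    spectrum = [0.0] * n_bins
    for m in range(1, n_harmonics + 1):
        b = f0_bin * m
        if b < n_bins:
            spectrum[b] += 1.0 / m
    return spectrum


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def demix(frame, templates, n_iter=500, eps=1e-12):
    """Estimate non-negative activations h that approximately minimise
    ||frame - sum_k h[k] * templates[k]||^2, with the templates held fixed,
    using multiplicative updates (which keep h non-negative)."""
    k_notes = len(templates)
    gram = [[dot(templates[i], templates[j]) for j in range(k_notes)]
            for i in range(k_notes)]
    corr = [dot(t, frame) for t in templates]
    h = [1.0] * k_notes
    for _ in range(n_iter):
        # Compute all denominators from the old h, then update simultaneously.
        denom = [sum(gram[k][j] * h[j] for j in range(k_notes))
                 for k in range(k_notes)]
        h = [h[k] * corr[k] / (denom[k] + eps) for k in range(k_notes)]
    return h


# Three hypothetical notes; notes 0 and 1 share a partial at bin 6.
templates = [harmonic_template(f0) for f0 in (2, 3, 5)]

# A "clean" frame: note 0 at full strength plus note 2 at half strength.
frame = [1.0 * a + 0.5 * c for a, c in zip(templates[0], templates[2])]

activations = demix(frame, templates)
print([round(a, 2) for a in activations])
```

In this toy setting the recovered activations should be close to 1 for the first note, close to 0 for the second, and close to 0.5 for the third, even though the first two templates overlap in one bin. The sketch assumes the frame already matches the canonical templates exactly, which is the condition the denoising stage is meant to establish; when that stage generalizes poorly, the frame no longer lies in the span of the templates and the decomposition has no way to correct for it.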