Training Neural Nets to Achieve Audio-to-Score Translation: Opening the Black-BoxDownload PDF


Oct 22, 2018 (edited Sep 10, 2019)NIPS 2018 Workshop IRASL Blind SubmissionReaders: Everyone
  • Abstract: It is suggested that the task of audio-to-score translation offers an adequate testbed to investigate the division of labor between background knowledge and machine learning in the domain of audio pattern recognition, with a controllable level of difficulty and the ability to synthesize a limitless amount of labelled data. As a proof of concept, this paper focuses on pitch detection from audio signals. Extensive background knowledge is used to initialize simple convolutional neural nets (NN) and achieve the recognition of single notes with a decent accuracy. The performance achieved by trained NNs, however, is significantly higher. Some tentative interpretations of this fact are obtained by opening the black box and inspecting the modifications of the NN filters due to supervised learning.
  • Keywords: Pitch detection, automatic music transcription, convolutional neural networks, prior knowledge, raw signal
3 Replies