Capturing Musical Structure Using Convolutional Recurrent Latent Variable Model

Eunjeong Koh, Dustin Wright, Shlomo Dubnov

Feb 12, 2018 · ICLR 2018 Workshop Submission
  • Abstract: In this paper, we present a model for learning musical features and generating novel sequences of music. Our model, the Convolutional-Recurrent Variational Autoencoder (C-RVAE), captures short-term polyphonic sequential musical structure using a Convolutional Neural Network as a front-end. To generate sequential data, we apply a recurrent latent variable model, which uses an encoder-decoder architecture with latent probabilistic connections to capture the hidden structure of music. Using the sequence-to-sequence model, our generative model can draw samples from a prior distribution to generate longer sequences of music.
  • TL;DR: We explore the Convolutional-Recurrent Variational Autoencoder (C-RVAE), which is an effective method of learning useful musical features that we use for polyphonic music generation.
  • Keywords: variational autoencoder, convolutional neural networks, recurrent neural networks, deep learning for music
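The "latent probabilistic connections" the abstract mentions refer to the standard VAE bottleneck: the encoder predicts a mean and log-variance, a latent vector is sampled via the reparameterization trick during training, and at generation time latents are drawn directly from the prior. The sketch below illustrates only that sampling step (the paper provides no code; all shapes, names, and the NumPy stand-in for a neural encoder/decoder are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, logvar):
    """Sample z ~ N(mu, diag(sigma^2)) differentiably: z = mu + sigma * eps."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

# Hypothetical shapes: a batch of 4 encoded piano-roll windows, 16-dim latent.
mu = rng.standard_normal((4, 16))      # encoder's predicted means
logvar = rng.standard_normal((4, 16))  # encoder's predicted log-variances

z_posterior = reparameterize(mu, logvar)  # training path: z from the encoder
z_prior = rng.standard_normal((4, 16))    # generation path: z ~ N(0, I)

# In the C-RVAE, a recurrent decoder would then map each z to a note sequence.
print(z_posterior.shape, z_prior.shape)
```

Either latent batch would be fed to the same decoder; sampling from the prior is what lets the trained model produce novel sequences without any input music.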