Abstract: Automatic melody generation for pop music has been a long-time aspiration for
both AI researchers and musicians. However, learning to generate euphonious
melodies has proven highly challenging for a number of reasons. Representing the
multivariate properties of notes has been one of the primary challenges. It is also
difficult to remain within the permissible spectrum of musical variety; a melody that
strays outside it is perceived as mere random playing, devoid of auditory pleasantness.
Observing the conventional structure of pop music poses further challenges.
In this paper, we propose to represent each note and its properties as a unique
‘word,’ thus lessening the prospect of misalignments between the properties, as
well as reducing the complexity of learning. We also enforce regularization policies
on the range of notes, thus encouraging the generated melody to stay close
to what humans would find easy to follow. Furthermore, we generate melody
conditioned on song part information, thus replicating the overall structure of a
full song. Experimental results demonstrate that our model can generate auditorily
pleasant songs that are harder to distinguish from human-written ones than those of
previous models.
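To make the abstract's two key ideas concrete, here is a minimal Python sketch (not the authors' implementation): each note's pitch and duration are fused into a single 'word' token, so the properties cannot fall out of alignment during generation, and a toy penalty hints at how a regularization policy on note range might be scored. The token format, the (pitch, duration) note encoding, and the pitch bounds are all illustrative assumptions.

```python
from typing import List, Tuple

# A note is modeled here as a (MIDI pitch, duration) pair; this exact
# encoding is an assumption, not taken from the paper.
Note = Tuple[int, int]

def note_to_word(pitch: int, duration: int) -> str:
    """Fuse a note's properties into one 'word', e.g. (60, 4) -> 'p60_d4',
    so pitch and duration are generated jointly rather than in parallel streams."""
    return f"p{pitch}_d{duration}"

def melody_to_ids(melody: List[Note]) -> List[int]:
    """Map a melody to integer token ids over the fused vocabulary."""
    words = [note_to_word(p, d) for p, d in melody]
    vocab = {w: i for i, w in enumerate(sorted(set(words)))}
    return [vocab[w] for w in words]

def range_penalty(melody: List[Note], low: int = 55, high: int = 79) -> int:
    """Toy stand-in for the range regularization: count notes outside a
    singable pitch band (the bounds are illustrative, not from the paper)."""
    return sum(1 for pitch, _ in melody if not low <= pitch <= high)

# Toy melody: C4, D4, E4 as quarter notes (4 sixteenths each).
melody = [(60, 4), (62, 4), (64, 4)]
print(melody_to_ids(melody))   # [0, 1, 2] -- ids ready for an LSTM/GAN sequence model
print(range_penalty(melody))   # 0 -- all notes lie within the permitted range
```

A single fused vocabulary trades a larger token set for a simpler learning problem: the model predicts one symbol per note instead of coordinating several per-property predictions.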
TL;DR: We propose a novel model to represent notes and their properties, which can enhance automatic melody generation.
Keywords: music, LSTM, GAN, generation, RNN, HMM
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/arxiv:1710.11549/code)
Withdrawal: Confirmed