Keywords: spiking neural networks, multimodality, convLSTM, semantic music generation
TL;DR: We propose a shallow spiking ConvLSTM. The model trained on a custom 2D representation of sequential semantic music data gives the best training results.
Abstract: Spiking neural networks have received considerable attention in the scientific literature due to their low memory and energy consumption, which makes them suitable for edge computing and offline deployment of AI models on low-power devices. The use of spiking neural networks in a multimodal generative context has not yet been explored; this paper focuses on comparing the performance of classical and spiking versions of a shallow Convolutional Long Short-Term Memory (ConvLSTM) network architecture. A new encoding strategy based on a system of graphs allowed dimensionality and data augmentation of one-dimensional sequential data, resulting in better generalization capacities of the trained models. To the best of our knowledge, this is the first attempt to implement a spiking ConvLSTM model for a semantic music generation task. We release the resources with the article (https://github.com/asnota/spiking_convlstms).
Submission Type: archival
Presentation Type: online
Presenter: Anna Shvets