SpectroBank: A filter-bank convolutional layer for CNN-based audio applications

Helena Peic Tukuljac; Benjamin Ricaud; Nicolas Aspert; Pierre Vandergheynst

25 Sept 2019 (modified: 05 May 2023) | ICLR 2020 Conference Blind Submission | Readers: Everyone

Keywords: audio, classification, convolutional neural network, deep learning, filter, filter-bank, raw waveform

TL;DR: A new convolution layer where the kernels are based on audio signal processing filters with few learnable parameters.

Abstract: We propose and investigate a new convolutional layer whose kernels are parameterized functions. The layer is intended as the input layer of convolutional neural networks for audio applications. Its kernels are defined as functions with a band-pass filter shape and a limited number of trainable parameters. We show that networks with such an input layer can achieve state-of-the-art accuracy on several audio classification tasks. The approach reduces both the number of weights to be trained and the network training time, and it enables larger kernel sizes, an advantage for audio applications. Furthermore, the learned filters bring additional interpretability and a better understanding of the data properties exploited by the network.
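
To make the idea of a parameterized band-pass kernel concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not name a kernel family, so the Gaussian-windowed cosine shape is an assumption, and the names `bandpass_kernel`, `filter_bank`, `conv1d_valid`, and `response_energy`, as well as the sample rate, kernel size, and frequencies, are hypothetical choices made for illustration only.

```python
import math


def bandpass_kernel(center_hz, bandwidth_hz, kernel_size, sample_rate):
    """Build one kernel from two parameters (assumed Gaussian-windowed cosine)."""
    # Envelope width in seconds; a narrower band means a longer envelope.
    sigma = 1.0 / (2.0 * math.pi * bandwidth_hz)
    half = (kernel_size - 1) / 2.0
    taps = []
    for n in range(kernel_size):
        t = (n - half) / sample_rate  # time relative to the kernel center
        envelope = math.exp(-0.5 * (t / sigma) ** 2)
        taps.append(envelope * math.cos(2.0 * math.pi * center_hz * t))
    return taps


def filter_bank(params, kernel_size, sample_rate):
    """One kernel per (center_hz, bandwidth_hz) pair."""
    return [bandpass_kernel(f, b, kernel_size, sample_rate) for f, b in params]


def conv1d_valid(signal, kernel):
    """Sliding dot product without padding, as in a CNN convolution layer."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def response_energy(signal, kernel):
    """Mean squared output of one filter applied to the signal."""
    out = conv1d_valid(signal, kernel)
    return sum(v * v for v in out) / len(out)


# Hypothetical settings, chosen only for this illustration.
SAMPLE_RATE = 16000
KERNEL_SIZE = 401
params = [(500.0, 100.0), (2000.0, 100.0)]  # (center_hz, bandwidth_hz)

bank = filter_bank(params, KERNEL_SIZE, SAMPLE_RATE)

# A pure 500 Hz tone should excite the first filter far more than the second.
tone = [math.sin(2.0 * math.pi * 500.0 * n / SAMPLE_RATE) for n in range(2000)]
energies = [response_energy(tone, kernel) for kernel in bank]

# Parameters to train: two per filter, versus one weight per tap otherwise.
learnable_params = 2 * len(params)
free_weights = KERNEL_SIZE * len(params)
```

In this sketch the number of trainable values depends only on the number of filters, not on the kernel length, which is one way to read the abstract's claim that parameterized kernels make larger kernel sizes affordable. In an actual network the two parameters per filter would be updated by gradient descent, which this sketch does not attempt.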

Code: https://app.box.com/s/vh5u7mpwrllhuqr8yl9jobohjrrjw797
