Reparameterized Multi-Resolution Convolutions for Long Sequence Modelling

Harry Jake Cunningham; Giorgio Giannone; Mingtian Zhang; Marc Peter Deisenroth

Published: 25 Sept 2024, Last Modified: 06 Nov 2024

Venue: NeurIPS 2024 poster

License: CC BY 4.0

Keywords: Sequence Modelling, Convolutions, Structural Reparameterization, Self Attention, State Space Models

Abstract: Global convolutions have shown increasing promise as powerful general-purpose sequence models. However, training long convolutions is challenging, and kernel parameterizations must be able to learn long-range dependencies without overfitting. This work introduces reparameterized multi-resolution convolutions ($\texttt{MRConv}$), a novel approach to parameterizing global convolutional kernels for long-sequence modeling. By leveraging multi-resolution convolutions, incorporating structural reparameterization and introducing learnable kernel decay, $\texttt{MRConv}$ learns expressive long-range kernels that perform well across various data modalities. Our experiments demonstrate state-of-the-art performance on the Long Range Arena, Sequential CIFAR, and Speech Commands tasks among convolution models and linear-time transformers. Moreover, we report improved performance on ImageNet classification by replacing 2D convolutions with 1D $\texttt{MRConv}$ layers.
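The abstract does not give the kernel parameterization, so the following is only a minimal sketch of the general idea behind structural reparameterization for 1D convolutions, not the authors' implementation. It relies on the linearity of convolution: a weighted sum of branches, each convolving the input with a kernel of a different length, gives the same output as a single convolution with the zero-padded, weighted sum of those kernels. The kernel sizes, the fixed exponential decay (a stand-in for a learnable decay), the random branch weights, and the helper names `causal_fft_conv` and `merge_kernels` are all assumptions made for illustration.

```python
import numpy as np

def causal_fft_conv(u, k):
    """Causal convolution of signal u with kernel k (len(k) <= len(u)) via FFT."""
    L = len(u)
    n = 2 * L  # zero-pad so the circular convolution matches the linear one
    y = np.fft.irfft(np.fft.rfft(u, n=n) * np.fft.rfft(k, n=n), n=n)
    return y[:L]

def merge_kernels(kernels, weights, L):
    """Zero-pad each sub-kernel to length L and sum them into one global kernel."""
    merged = np.zeros(L)
    for k, w in zip(kernels, weights):
        merged[:len(k)] += w * k
    return merged

rng = np.random.default_rng(0)
L = 64
u = rng.standard_normal(L)

# Hypothetical multi-resolution branches: kernels of increasing length,
# each damped by a fixed exponential decay (assumed, not from the paper).
sizes = [8, 16, 32, 64]
kernels = [rng.standard_normal(s) * np.exp(-0.05 * np.arange(s)) for s in sizes]
weights = rng.standard_normal(len(sizes))

# Multi-branch form: one convolution per resolution, then a weighted sum.
y_branches = sum(w * causal_fft_conv(u, k) for k, w in zip(kernels, weights))

# Reparameterized form: merge the kernels once, then run a single convolution.
merged_kernel = merge_kernels(kernels, weights, L)
y_merged = causal_fft_conv(u, merged_kernel)

max_abs_diff = float(np.max(np.abs(y_branches - y_merged)))
```

In this sketch the two forms agree up to floating-point error, which is why a multi-branch model can be collapsed into a single long kernel after training so that only one convolution is computed per layer.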

Supplementary Material: zip

Primary Area: Deep learning architectures

Submission Number: 9552
