Linear Recurrences Accessible to Everyone

Published: 23 Jan 2025 · Last Modified: 31 Mar 2025 · ICLR 2025 Blogpost Track · CC BY 4.0
Blogpost Url: https://d2jud02ci9yv69.cloudfront.net/2025-04-28-linrec-71/blog/linrec/
Abstract: Investigating linear RNNs such as Mamba can be challenging because they currently cannot be expressed efficiently in PyTorch. We propose the abstraction of linear recurrences to build intuition for the computational structure of these emerging deep learning architectures. After deriving their parallel algorithm, we gradually build up a simple template CUDA extension for PyTorch. We hope that making linear recurrences accessible to a wider audience inspires further research on linear-time sequence mixing.
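To make the abstraction concrete, here is a minimal sketch (not the post's actual implementation) of a linear recurrence h_t = a_t · h_{t-1} + x_t, together with the associative combine rule that underlies its parallel-scan algorithm. The function names and the specific parameterization are illustrative; architectures like Mamba use more elaborate, learned coefficients.

```python
from functools import reduce

def linear_recurrence(coeffs, inputs, h0=0.0):
    """Sequentially evaluate h_t = a_t * h_{t-1} + x_t for each step t."""
    h = h0
    out = []
    for a, x in zip(coeffs, inputs):
        h = a * h + x
        out.append(h)
    return out

def combine(left, right):
    """Associative combine of two recurrence steps (a, x).

    Composing h -> a1*h + x1 and then h -> a2*h + x2 gives
    h -> (a1*a2)*h + (a2*x1 + x2), so pairs (a, x) form a monoid.
    This associativity is what allows a parallel (prefix-scan) evaluation.
    """
    a1, x1 = left
    a2, x2 = right
    return (a1 * a2, a2 * x1 + x2)

# Folding all steps with the combine rule yields the same final state
# as the sequential loop started from h0 = 0.
steps = [(0.5, 1.0), (0.5, 1.0), (0.5, 1.0)]
a_total, x_total = reduce(combine, steps)
```
In a real CUDA kernel, the `combine` operator is what each thread block applies during the up-sweep and down-sweep phases of a prefix scan, which is how the sequential O(T) loop becomes a parallel O(log T)-depth computation.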
Conflict Of Interest: My PhD supervisor was the first author of one of the cited papers during his PhD. The contributions of that paper and of this blog post are very different, and I don't believe this can be considered advertisement. I intend to release the [template CUDA extension for PyTorch](https://anonymous.4open.science/r/linrec-2F7F/) as a pip package. It could then be perceived as a competitor to similar PyTorch extensions:
- [github.com/proger/accelerated-scan](https://github.com/proger/accelerated-scan)
- [github.com/alxndrTL/mamba.py](https://github.com/alxndrTL/mamba.py/blob/main/mambapy/pscan.py)
- [github.com/johnryan465/pscan](https://github.com/johnryan465/pscan)
Submission Number: 103