Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention

ICML 2020 (modified: 17 Nov 2022)
Abstract: Transformers achieve remarkable performance on several tasks, but due to their quadratic complexity with respect to the input's length, they are prohibitively slow for very long sequences. To addre...
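The quadratic cost mentioned in the abstract comes from materializing the full N×N attention matrix. The paper's title refers to linear attention, which replaces the softmax with a kernel feature map so attention can be computed as φ(Q)(φ(K)ᵀV) in O(N) time. A minimal NumPy sketch of this idea, using the elu(x)+1 feature map (the function names and shapes here are illustrative, not the authors' code):

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1: positive feature map so attention weights are nonnegative
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V):
    """Kernelized attention in O(N * d * d_v) instead of O(N^2).

    Q, K: (N, d) queries and keys; V: (N, d_v) values.
    Computes phi(Q) (phi(K)^T V), normalized row-wise, without ever
    forming the N x N attention matrix.
    """
    Qp, Kp = feature_map(Q), feature_map(K)
    KV = Kp.T @ V                  # (d, d_v) summary, independent of N
    Z = Qp @ Kp.sum(axis=0)        # (N,) per-row normalizer
    return (Qp @ KV) / Z[:, None]
```

Because the (d, d_v) summary `KV` can be updated one token at a time, the autoregressive (causal) variant runs as a recurrence with constant per-step state, which is the sense in which such a transformer behaves like an RNN.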