KDEformer: Accelerating Transformers via Kernel Density Estimation

Published: 2023, Last Modified: 22 Jan 2024, ICML 2023, Readers: Everyone
Abstract: The dot-product attention mechanism plays a crucial role in modern deep architectures (e.g., Transformer) for sequence modeling; however, naïve exact computation of this model incurs quadratic time and...