Published: 2023, Last Modified: 22 Jan 2024, ICML 2023, Readers: Everyone
Abstract: The dot-product attention mechanism plays a crucial role in modern deep architectures (e.g., Transformer) for sequence modeling; however, naïve exact computation of this model incurs quadratic time and...