Published: 01 Jan 2020, Last Modified: 13 May 2023, ICML 2020, Readers: Everyone
Abstract: The Transformer architecture has achieved considerable success recently; the key component of the Transformer is the attention layer that enables the model to focus on important regions within an i...