2021 (modified: 08 Nov 2021) | ICML 2021 | Readers: Everyone
Abstract: We introduce a new balanced assignment of experts (BASE) layer for large language models that greatly simplifies existing high-capacity sparse layers. Sparse layers can dramatically improve the eff...