BASE Layers: Simplifying Training of Large, Sparse Models

2021 (modified: 08 Nov 2021), ICML 2021, Readers: Everyone
Abstract: We introduce a new balanced assignment of experts (BASE) layer for large language models that greatly simplifies existing high capacity sparse layers. Sparse layers can dramatically improve the eff...