GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding

28 Sept 2020, 15:48 (edited 10 Feb 2022, 11:45)
ICLR 2021 Poster
Readers: Everyone