SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient

2023 (modified: 14 Apr 2023), CoRR 2023. Readers: Everyone
