Enabling Parallelism Hot Switching for Efficient Training of Large Language Models

Hao Ge, Fangcheng Fu, Haoyang Li, Xuanyu Wang, Sheng Lin, Yujie Wang, Xiaonan Nie, Hailin Zhang, Xupeng Miao, Bin Cui

Published: 04 Nov 2024, Last Modified: 25 Jan 2026 | Crossref | CC BY-SA 4.0