Mist: Efficient Distributed Training of Large Language Models via Memory-Parallelism Co-Optimization

Published: 01 Jan 2025, Last Modified: 23 Apr 2025, Venue: EuroSys 2025, License: CC BY-SA 4.0