Prism: Unleashing GPU Sharing for Cost-Efficient Multi-LLM Serving

Shan Yu, Jiarong Xing, Yifan Qiao, Mingyuan Ma, Yangmin Li, Yang Wang, Shuo Yang, Zhiqiang Xie, Shiyi Cao, Ke Bao, Ion Stoica, Harry Xu, Ying Sheng

Published: 2025, Last Modified: 07 May 2026, CoRR 2025, CC BY-SA 4.0