High-throughput Generative Inference of Large Language Models with a Single GPU
Ying Sheng, Lianmin Zheng, Binhang Yuan, Zhuohan Li, Max Ryabinin, Daniel Y. Fu, Zhiqiang Xie, Beidi Chen, Clark W. Barrett, Joseph E. Gonzalez, Percy Liang, Christopher Ré, Ion Stoica, Ce Zhang
2023 (modified: 14 Apr 2023)
CoRR 2023