ServerlessLLM: Low-Latency Serverless Inference for Large Language Models
Yao Fu, Leyang Xue, Yeqi Huang, Andrei-Octavian Brabete, Dmitrii Ustiugov, Yuvraj Patel, Luo Mai
Published: 01 Jan 2024, Last Modified: 12 May 2025
OSDI 2024
CC BY-SA 4.0