OpenReview.net
Reducing Language Model Inference Latency using CPU-Assisted Serving
Theodoros Aslanidis, Sokol Kosta, Raffaele Montella, Spyros Lalis, Dimitris Chatzopoulos
Published: 2026, Last Modified: 02 May 2026
EuroMLSys@EuroSys 2026
CC BY-SA 4.0
External IDs: dblp:conf/euromlsys/AslanidisKMLC26