SpInfer: Leveraging Low-Level Sparsity for Efficient Large Language Model Inference on GPUs
Ruibo Fan, Xiangrui Yu, Peijie Dong, Zeyu Li, Gu Gong, Qiang Wang, Wei Wang, Xiaowen Chu
Published: 01 Jan 2025, Last Modified: 04 May 2025
EuroSys 2025
CC BY-SA 4.0