Flash-LLM: Enabling Low-Cost and Highly-Efficient Large Generative Model Inference With Unstructured Sparsity
Haojun Xia, Zhen Zheng, Yuchao Li, Donglin Zhuang, Zhongzhu Zhou, Xiafei Qiu, Yong Li, Wei Lin, Shuaiwen Leon Song
Published: 01 Jan 2023, Last Modified: 09 May 2025
Proc. VLDB Endow. 2023
CC BY-SA 4.0