GQSA: Group Quantization and Sparsity for Accelerating Large Language Model Inference

Chao Zeng, Songwei Liu, Shu Yang, Fangmin Chen, Xing Mei, Lean Fu

Published: 2024, Last Modified: 29 Mar 2026 | CoRR 2024 | CC BY-SA 4.0