SPARK: co-exploring model SPArsity and low-RanKness for compact neural networks

Published: 28 Jan 2022, Last Modified: 13 Feb 2023. ICLR 2022 Submission.
Keywords: model compression, low-rankness, sparsity, tensor
Abstract: Sparsification and low-rank decomposition are two important techniques for deep neural network (DNN) compression. To date, these two popular yet distinct approaches have typically been used separately, and their efficient integration for better compression performance remains little explored. In this paper we perform a systematic co-exploration of model sparsity and low-rankness towards compact neural networks. We first investigate and analyze several important design factors for joint pruning and low-rank factorization, including operational sequence, low-rank format, and optimization objective. Based on the observations and outcomes of our analysis, we then propose SPARK, a unified DNN compression framework that simultaneously captures model SPArsity and low-RanKness in an efficient way. Empirical experiments demonstrate the promising performance of our proposed solution. Notably, on the CIFAR-10 dataset, our approach brings 1.25%, 1.02% and 0.16% accuracy increases over the baseline ResNet-20, ResNet-56 and DenseNet-40 models, respectively, while reducing storage and computational costs by 70.4% and 71.1% (for ResNet-20), 37.5% and 39.3% (for ResNet-56), and 52.4% and 61.3% (for DenseNet-40), respectively. On the ImageNet dataset, our approach achieves a 0.52% accuracy increase over the baseline model with 48.7% fewer parameters.
One-sentence Summary: We propose a unified DNN compression framework, SPARK, that simultaneously captures model sparsity and low-rankness and achieves better performance than the uncompressed model with reduced storage and computational costs.
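
To make the general idea concrete, below is a minimal illustrative sketch of combining low-rankness and sparsity on a single weight matrix: a truncated SVD gives the low-rank component and magnitude-based pruning keeps a sparse residual. This is a generic low-rank-plus-sparse decomposition written under our own assumptions, not the SPARK algorithm itself (whose operational sequence, low-rank format, and objective are described in the paper, not in this abstract); the function name and parameters are hypothetical.

```python
import numpy as np

def low_rank_plus_sparse(W, rank, sparsity):
    """Approximate W ~ L + S, where L is low-rank and S is sparse.

    Illustrative sketch only; not the paper's SPARK method.
    rank: target rank of L; sparsity: fraction of entries kept in S.
    """
    # Low-rank component via truncated SVD.
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    L = U[:, :rank] @ np.diag(s[:rank]) @ Vt[:rank, :]

    # Sparse component: keep only the largest-magnitude residual entries.
    R = W - L
    k = int(sparsity * R.size)
    if k > 0:
        thresh = np.partition(np.abs(R).ravel(), -k)[-k]
        S = np.where(np.abs(R) >= thresh, R, 0.0)
    else:
        S = np.zeros_like(R)
    return L, S

# Example: compress a random 256x256 weight matrix and check the error.
W = np.random.randn(256, 256)
L, S = low_rank_plus_sparse(W, rank=32, sparsity=0.05)
print(np.linalg.norm(W - (L + S)) / np.linalg.norm(W))  # relative error
```

In a sketch like this, the storage cost of L grows linearly with the chosen rank and that of S with the kept-entry fraction, which is why the paper's joint exploration of both factors (rather than tuning each in isolation) matters for the accuracy-compression trade-off.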