TL;DR: This paper studies structured sparse training of CNNs that leads to fixed, sparse weight matrices after a set number of epochs.
Abstract: This paper studies structured sparse training of CNNs with a gradual pruning technique that leads to fixed, sparse weight matrices after a set number of epochs. We simplify the structure of the enforced sparsity to reduce the overhead caused by regularization. The proposed training methodology explores several options for structured sparsity.
We study various tradeoffs with respect to pruning duration, learning-rate configuration, and the total length of training.
We show that our method produces sparse versions of ResNet50 and ResNet50v1.5 on full ImageNet with less than 1% loss in accuracy. To make sure that this type of sparse training does not harm the robustness of the network, we also demonstrate how the network behaves in the presence of adversarial attacks. Our results show that with 70% target sparsity, over 75% top-1 accuracy is achievable.
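To illustrate the gradual pruning idea described in the abstract, here is a minimal sketch. It is not the paper's implementation. The abstract does not specify the pruning schedule or the sparsity structure, so the sketch assumes a linear sparsity ramp and row-wise pruning by L1 norm (a row could stand for an output channel). The function names `sparsity_at`, `row_mask`, and `apply_mask` and the epoch parameters are hypothetical.

```python
def sparsity_at(epoch, prune_start, prune_end, target):
    """Assumed schedule: ramp sparsity linearly from 0 to `target`
    between prune_start and prune_end, then hold it fixed."""
    if epoch <= prune_start:
        return 0.0
    if epoch >= prune_end:
        return target
    return target * (epoch - prune_start) / (prune_end - prune_start)


def row_mask(weights, sparsity):
    """Assumed structure: prune whole rows with the smallest L1 norm.
    Returns one 0/1 entry per row of `weights` (a list of lists)."""
    n_rows = len(weights)
    n_prune = int(sparsity * n_rows)
    norms = [sum(abs(w) for w in row) for row in weights]
    # Row indices sorted from smallest to largest L1 norm.
    order = sorted(range(n_rows), key=lambda i: norms[i])
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(n_rows)]


def apply_mask(weights, mask):
    """Zero out every row whose mask entry is 0."""
    return [[w * m for w in row] for row, m in zip(weights, mask)]


if __name__ == "__main__":
    weights = [[0.1, -0.1], [2.0, 1.0], [0.0, 0.05], [-3.0, 0.5]]
    for epoch in range(0, 6):
        s = sparsity_at(epoch, prune_start=0, prune_end=4, target=0.5)
        mask = row_mask(weights, s)
        print(epoch, s, mask)
    # After prune_end the sparsity target no longer changes; a training loop
    # would keep the final mask and reapply it after every weight update.
```

In a real training loop the mask would be recomputed during the pruning phase and then frozen, so the sparse weight layout stays fixed for the remaining epochs.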
Keywords: Structured Sparsity, Sparsity, Training, Compression, Adversarial, Regularization, Acceleration