Hybrid Pruning: Thinner Sparse Networks for Fast Inference on Edge Devices

Xiaofan Xu; Mi Sun Park; Cormac Brick

Published: 07 Nov 2018, Last Modified: 05 May 2023; NIPS 2018 Workshop CDNNRIA Blind Submission; Readers: Everyone

Abstract: We introduce hybrid pruning, which combines coarse-grained channel pruning with fine-grained weight pruning to reduce model size, computation, and power demands with little to no loss in accuracy, enabling deployment of modern networks on resource-constrained devices such as always-on security cameras and drones. To perform channel pruning effectively, we also propose a fast sensitivity test that quickly identifies how sensitive the output accuracy is to pruning within and across the layers of a network, given a target MAC count or accuracy tolerance. Our experiment shows significantly better results for ResNet50 on ImageNet than existing work, even under the additional constraint that channel counts be hardware-friendly numbers.

Keywords: Network Pruning, Channel Pruning, Weight Pruning, Sensitivity Test
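
The two-stage idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how channels or weights are ranked, so everything specific here is an assumption: channels are ranked by the L1 norm of their filters, weights by absolute magnitude, and "hardware-friendly" is taken to mean a channel count that is a multiple of 8. The function names (prune_channels, prune_weights, hybrid_prune) are hypothetical, and keep_ratio simply stands in for whatever per-layer budget the sensitivity test would select. Fine-tuning after pruning is not modeled.

```python
def prune_channels(weights, keep_ratio, multiple=8):
    """Coarse step: keep the output channels with the largest L1 norm.

    weights: list of rows, one flattened filter per output channel.
    The kept count is rounded up to a multiple of `multiple` (an assumed
    hardware-friendly granularity) and capped at the original count.
    """
    n = len(weights)
    target = max(1, round(n * keep_ratio))
    keep = min(n, -(-target // multiple) * multiple)  # ceil to a multiple
    norms = [sum(abs(w) for w in row) for row in weights]
    ranked = sorted(range(n), key=lambda i: norms[i], reverse=True)
    kept = sorted(ranked[:keep])  # preserve original channel order
    return [list(weights[i]) for i in kept], kept


def prune_weights(weights, sparsity):
    """Fine step: zero the smallest-magnitude fraction of the weights."""
    positions = [
        (abs(w), r, c)
        for r, row in enumerate(weights)
        for c, w in enumerate(row)
    ]
    positions.sort()
    n_zero = int(len(positions) * sparsity)
    out = [list(row) for row in weights]
    for _, r, c in positions[:n_zero]:
        out[r][c] = 0.0
    return out


def hybrid_prune(weights, keep_ratio, sparsity, multiple=8):
    """Channel pruning first (thinner layer), then weight pruning (sparser)."""
    thin, kept = prune_channels(weights, keep_ratio, multiple)
    return prune_weights(thin, sparsity), kept
```

For a layer with 16 output channels, calling hybrid_prune(layer, keep_ratio=0.5, sparsity=0.5) would first drop the 8 weakest channels and then zero half of the remaining weights, yielding a layer that is both thinner and sparse.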
