Abstract: Deep convolutional neural networks (CNNs) have achieved impressive performance in many computer vision tasks. However, their large model sizes demand heavy computational resources, making the pruning of redundant filters from existing pre-trained CNNs an essential task in developing efficient models for resource-constrained devices. Whole-network filter-pruning algorithms prune varying fractions of filters from each layer, hence providing greater flexibility. Current state-of-the-art whole-network pruning methods are either computationally expensive, because they calculate the loss for each pruned filter using a training dataset, or they rely on various heuristic or learned criteria to determine the pruning fraction for each layer. Hence, there is a need for a simple and efficient technique for whole-network pruning. This paper proposes
a two-level hierarchical approach for whole-network filter pruning that is efficient and uses the classification loss as the final criterion. The lower-level algorithm (called filter-pruning) uses a sparse-approximation formulation based on a linear approximation of filter weights. We explore two algorithms: orthogonal matching pursuit (OMP)-based greedy selection and a greedy backward-pruning approach. The backward-pruning algorithm uses a novel closed-form error criterion to efficiently select the optimal filter at each stage, making the whole algorithm much faster. The higher-level algorithm (called layer-selection) greedily selects the best layer to prune (using the lower-level filter-pruning algorithm) according to a global pruning criterion. We propose algorithms for two different global pruning criteria: (1) layer-wise relative error (HBGS), and (2) final classification error (HBGTS). Our suite of algorithms outperforms state-of-the-art pruning methods on ResNet18, ResNet32, ResNet56, VGG16, and ResNext101. Our method reduces the RAM requirement for ResNext101 from 7.6 GB to 1.5 GB and achieves a 94% reduction in FLOPS without losing accuracy on CIFAR-10.
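For intuition, the lower level can be read as a sparse-approximation / subset-selection problem over a layer's filters: keep k filters whose linear span best reconstructs all of the layer's filters. Below is a minimal NumPy sketch of an OMP-style selection under that reading; the function name fp_omp_sketch, the correlation scoring, and the per-candidate normalization are illustrative assumptions, not the paper's exact FP-OMP procedure (which is specified in the full text).

```python
import numpy as np

def fp_omp_sketch(W, k):
    """Greedy OMP-style filter selection (illustrative sketch only).

    W : (n_filters, d) array; each row is one flattened filter of a layer.
    k : number of filters to keep.
    Returns (selected, X) such that W is approximately X @ W[selected],
    i.e. every filter is rebuilt as a linear combination of the kept ones.
    """
    n, _ = W.shape
    norms = np.linalg.norm(W, axis=1) + 1e-12   # guard against zero filters
    selected = []
    residual = W.copy()                         # part of W not yet explained
    X = np.zeros((n, 0))
    for _ in range(k):
        # Pick the candidate filter most correlated with the residual.
        scores = np.linalg.norm(residual @ W.T, axis=0) / norms
        scores[selected] = -np.inf              # never re-pick a kept filter
        selected.append(int(np.argmax(scores)))
        # Refit all filters on the kept subset by least squares.
        B = W[selected]                                   # (|S|, d) basis
        X = np.linalg.lstsq(B.T, W.T, rcond=None)[0].T    # (n, |S|) coefficients
        residual = W - X @ B
    return selected, X
```

As a usage example, a convolutional layer with 64 filters of shape (c_in, 3, 3) would give W of shape (64, c_in*9); the higher-level layer-selection loop would then compare a global criterion (e.g., reconstruction or classification error) across candidate layers pruned this way and greedily commit to the best one.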
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: 1. Added citations to coreset-based pruning methods in the Related Work section.
2. Clarified the time complexity of FP-OMP for uniform pruning in Section 3.2.
3. Changed $S'$ to $S^c$ in line 9 of Algorithm 1.
4. Changed $N$ to $|D|$ in line 10 of Algorithm 2 and line 7 of Algorithm 4.
5. Emphasized high parameter reduction scenarios in Section 4.2.
6. Added results for ResNet18/CIFAR100@95%, VGG16/CIFAR10@98%, ResNet56/CIFAR100@98%, and ResNet32/CIFAR100@98% in Table 1.
7. Included EarlyCrop-S time in Figure 3(b).
Video: https://www.youtube.com/watch?v=egk0ZJb89Ao
Code: https://github.com/kiranpurohit/Hierarchical_Filter_Pruning
Assigned Action Editor: ~Cedric_Archambeau1
Submission Number: 2362