Abstract: Neural network pruning and quantization are two major lines of network compression. This raises a natural question: can we find the optimal compression by considering multiple compression criteria in a unified framework? This paper incorporates both criteria and seeks layer-wise compression by leveraging a meta-learning framework. A regularization loss unifies the constraints on input and output channel numbers and on the bit-widths of network activations and weights, so that the compressed network satisfies a given Bit-OPerations (BOPs) count constraint. We further propose an iterative compression constraint for optimizing the compression procedure, which achieves a high compression rate while maintaining the original network's performance. Extensive experiments on various networks and vision tasks show that the proposed method yields better performance and compression rates than recent methods. For instance, it achieves better image classification accuracy and compactness than the recent DJPQ, and it matches the recent DHP on image super-resolution while saving about 50% of the computation.
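To make the BOPs constraint concrete, below is a minimal illustrative sketch of how a convolutional layer's BOPs depend jointly on pruned channel counts and quantization bit-widths, together with a simple budget penalty. The function names (`bops_conv`, `bops_penalty`), the hinge-style penalty form, and all numbers are assumptions for illustration, not the paper's actual regularization loss.

```python
# Hypothetical illustration: BOPs of a single conv layer and a budget penalty.
# The exact formulation used in the paper may differ.

def bops_conv(c_in, c_out, k, h_out, w_out, bw_w, bw_a):
    """BOPs of a k x k conv layer: MAC count scaled by weight/activation bit-widths."""
    macs = c_in * c_out * k * k * h_out * w_out
    return macs * bw_w * bw_a

def bops_penalty(layer_bops, budget):
    """Hinge-style regularizer: penalize only when total BOPs exceed the budget."""
    total = sum(layer_bops)
    return max(0.0, total / budget - 1.0)

# Example: a 3x3 conv with pruned channels (64 -> 48 in, 128 -> 96 out),
# 4-bit weights and 8-bit activations, on a 56x56 output feature map.
layer = bops_conv(c_in=48, c_out=96, k=3, h_out=56, w_out=56, bw_w=4, bw_a=8)
print(f"layer BOPs: {layer:.3e}")
print(f"penalty vs. 2e9 BOPs budget: {bops_penalty([layer], budget=2e9):.3f}")
```

Pruning channels reduces the MAC count while lowering bit-widths reduces the per-MAC cost, so a single BOPs budget lets one regularizer trade the two criteria off against each other.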