Keywords: convolutional neural network, block-wise separable convolution, network architecture search
Abstract: Convolutional neural networks (CNNs) have proven highly capable of solving a wide range of computer vision tasks with strong prediction performance. However, higher accuracy often comes with a growing number of model parameters and a large computational cost, which makes these models hard to deploy on resource-limited devices. In this paper, we introduce block-wise separable convolutions (BlkSConv), which replace standard convolutions in order to compress deep CNN models. First, BlkSConv expresses the standard convolutional kernel as an ordered set of block vectors, each of which is a linear combination of fixed basis block vectors. It then eliminates most of the basis block vectors and their corresponding coefficients to obtain an approximated convolutional kernel. Moreover, the proposed BlkSConv operation can be realized efficiently as a combination of pointwise and group-wise convolutions. The resulting networks therefore have a smaller model size and fewer multiply-add (MAdd) operations while keeping comparable prediction accuracy. It is not known in advance, however, how to choose suitable values for the two hyperparameters involved: the block depth and the number of basis block vectors. To address this problem, we develop a hyperparameter search framework based on principal component analysis (PCA) that helps determine these two hyperparameters so that the corresponding network achieves strong prediction performance while satisfying constraints on model size and model efficiency. In our experiments, we evaluate BlkSConv-based CNNs in which several convolutional layers are replaced by the BlkSConv layers suggested by the proposed PCA-based hyperparameter search algorithm. The results show that BlkSConv-based CNNs achieve performance competitive with their standard convolutional counterparts on datasets including ImageNet, CIFAR-10/100, Stanford Dogs, and Oxford Flowers.
TL;DR: The proposed BlkSConv can approximate the standard convolution in various ways. Given a trained model, our hyperparameter search algorithm (HSA) can find a corresponding BlkSConv-based model with fewer parameters and MAdds while preserving comparable performance.
Supplementary Material: zip
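The kernel approximation described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not confirm: each filter is cut along the input-channel axis into blocks of `block_depth` channels, a single set of basis block vectors is shared across the whole layer, and that basis is taken from an uncentered PCA (SVD) of the block vectors. The function names `blockwise_lowrank` and `relative_error` are hypothetical, and the mapping onto pointwise and group-wise convolutions is not shown.

```python
import numpy as np


def blockwise_lowrank(kernel, block_depth, num_basis):
    """Approximate a conv kernel with a few shared basis block vectors.

    kernel: array of shape (c_out, c_in, k, k); c_in must be divisible
    by block_depth. Returns (approx_kernel, basis, coeffs).
    """
    c_out, c_in, k, _ = kernel.shape
    assert c_in % block_depth == 0, "block_depth must divide c_in"
    n_blocks = c_in // block_depth
    # Cut every filter along the input-channel axis into n_blocks blocks of
    # shape (block_depth, k, k) and flatten each block into one row vector.
    blocks = kernel.reshape(c_out * n_blocks, block_depth * k * k)
    # Assumption: the basis comes from an uncentered PCA of the block vectors.
    # Rows of vt are orthonormal directions ordered by singular value.
    _, _, vt = np.linalg.svd(blocks, full_matrices=False)
    basis = vt[:num_basis]        # kept basis block vectors
    coeffs = blocks @ basis.T     # one coefficient per block and basis vector
    approx = (coeffs @ basis).reshape(kernel.shape)
    return approx, basis, coeffs


def relative_error(kernel, approx):
    """Frobenius-norm error of the approximation relative to the kernel."""
    return np.linalg.norm(kernel - approx) / np.linalg.norm(kernel)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((64, 32, 3, 3))   # stand-in for a trained kernel
    approx, basis, coeffs = blockwise_lowrank(w, block_depth=4, num_basis=6)
    print("original parameters:  ", w.size)
    print("compressed parameters:", basis.size + coeffs.size)
    print("relative error:       ", relative_error(w, approx))
```

Under these assumptions, sweeping `block_depth` and `num_basis` and comparing the parameter count against the reconstruction error gives a rough picture of the trade-off that a hyperparameter search has to navigate. A random kernel, as used in the demo, compresses poorly; trained kernels may behave differently.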