- Abstract: Modern neural networks often require deep compositions of high-dimensional nonlinear functions (wide architecture) to achieve high test accuracy, and thus can have overwhelming number of parameters. Repeated high cost in prediction at test-time makes neural networks ill-suited for devices with constrained memory or computational power. We introduce an efficient mechanism, reshaped tensor decomposition, to compress neural networks by exploiting three types of invariant structures: periodicity, modulation and low rank. Our reshaped tensor decomposition method exploits such invariance structures using a technique called tensorization (reshaping the layers into higher-order tensors) combined with higher order tensor decompositions on top of the tensorized layers. Our compression method improves low rank approximation methods and can be incorporated to (is complementary to) most of the existing compression methods for neural networks to achieve better compression. Experiments on LeNet-5 (MNIST), ResNet-32 (CI- FAR10) and ResNet-50 (ImageNet) demonstrate that our reshaped tensor decomposition outperforms (5% test accuracy improvement universally on CIFAR10) the state-of-the-art low-rank approximation techniques under same compression rate, besides achieving orders of magnitude faster convergence rates.
- Keywords: Neural Network Compression, Low Rank Approximation, Higher Order Tensor Decomposition
- TL;DR: Compression of neural networks which improves the state-of-the-art low rank approximation techniques and is complementary to most of other compression techniques.