Abstract: Large neural networks have achieved impressive progress in a variety of real-world applications. However, the storage and computational resources required to run deep networks make them difficult to deploy on mobile devices. Recently, matrix and tensor decompositions have been employed to compress neural networks. In this paper, we develop a simultaneous tensor decomposition technique for network optimization. We first discuss networks with shared structure; in some cases, not only the structure but also the parameters are shared to form a compressed model, at the expense of degraded performance. This suggests that the weight tensors of different layers within one network contain both identical components and independent components. To exploit this characteristic, two new coupled tensor train decompositions are developed for the fully and partly structure-sharing cases, and an alternating optimization approach is proposed for the low-rank tensor computation. Finally, we restore the performance of the compressed model by fine-tuning and derive the compression ratio of the devised approach. Experimental results demonstrate the benefits of our algorithm for both image reconstruction and classification, using well-known datasets such as CIFAR-10/CIFAR-100 and ImageNet and widely used networks such as ResNet. Compared with state-of-the-art independent matrix and tensor decomposition based methods, our model achieves better network performance under the same compression ratio.
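To make the idea of coupled decomposition concrete, the following is a minimal sketch (not the authors' implementation) of a coupled tensor train factorization of two 3-way weight tensors of identical shape, where the first TT core is shared (the "identical component") and the remaining cores are kept separate per tensor (the "independent components"), fitted by alternating least squares. All tensor names, shapes, and ranks below are illustrative assumptions.

```python
# Sketch: coupled tensor train (TT) decomposition of two weight tensors W1, W2
# that share their first TT core; remaining cores are independent per tensor.
import numpy as np

def coupled_tt_als(W1, W2, r1, r2, n_sweeps=20, seed=0):
    n1, n2, n3 = W1.shape
    rng = np.random.default_rng(seed)
    G1 = rng.standard_normal((n1, r1))                    # shared first core
    cores = {k: [rng.standard_normal((r1, n2, r2)),       # independent cores
                 rng.standard_normal((r2, n3))] for k in (1, 2)}
    W = {1: W1, 2: W2}

    def lstsq(A, B):                                      # solve X A = B for X
        return np.linalg.lstsq(A.T, B.T, rcond=None)[0].T

    for _ in range(n_sweeps):
        # update the shared core from both tensors jointly
        Ms, Ys = [], []
        for k in (1, 2):
            G2, G3 = cores[k]
            Ms.append(np.einsum('aib,bj->aij', G2, G3).reshape(r1, n2 * n3))
            Ys.append(W[k].reshape(n1, n2 * n3))
        G1 = lstsq(np.hstack(Ms), np.hstack(Ys))

        # update the independent cores of each tensor separately
        for k in (1, 2):
            G2, G3 = cores[k]
            # middle core: mode-2 unfolding
            P = np.einsum('ia,bj->abij', G1, G3).reshape(r1 * r2, n1 * n3)
            W2unf = W[k].transpose(1, 0, 2).reshape(n2, n1 * n3)
            G2 = lstsq(P, W2unf).reshape(n2, r1, r2).transpose(1, 0, 2)
            # last core: mode-3 unfolding
            N = np.einsum('ia,ajb->bij', G1, G2).reshape(r2, n1 * n2)
            W3unf = W[k].transpose(2, 0, 1).reshape(n3, n1 * n2)
            G3 = lstsq(N, W3unf).T
            cores[k] = [G2, G3]

    return G1, cores

# Toy usage: two correlated layer weights compressed with one shared core.
if __name__ == "__main__":
    n1, n2, n3, r1, r2 = 16, 16, 16, 4, 4
    rng = np.random.default_rng(1)
    base = rng.standard_normal((n1, n2, n3))
    W1 = base + 0.1 * rng.standard_normal(base.shape)
    W2 = base + 0.1 * rng.standard_normal(base.shape)
    G1, cores = coupled_tt_als(W1, W2, r1, r2)
    rec = np.einsum('ia,ajb,bk->ijk', G1, *cores[1])
    print("relative error:", np.linalg.norm(rec - W1) / np.linalg.norm(W1))
```

Storing one shared core instead of two is what raises the compression ratio relative to decomposing each layer independently; in the partly structure-sharing case, only a subset of cores would be tied across layers.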