Abstract: Deep learning is becoming more widespread in its application due to its power in solving complex classification problems. However, deep learning models often require large memory and energy consumption, which may prevent them from being deployed effectively on embedded platforms, limiting their applications. This work addresses the problem by proposing methods {\em Weight Reduction Quantisation} for compressing the memory footprint of the models, including reducing the number of weights and the number of bits to store each weight. Beside, applying with sparsity-inducing regularization, our work focuses on speeding up stochastic variance reduced gradients (SVRG) optimization on non-convex problem. Our method that mini-batch SVRG with $\ell$1 regularization on non-convex problem has faster and smoother convergence rates than SGD by using adaptive learning rates. Experimental evaluation of our approach uses MNIST and CIFAR-10 datasets on LeNet-300-100 and LeNet-5 models, showing our approach can reduce the memory requirements both in the convolutional and fully connected layers by up to 60$\times$ without affecting their test accuracy.
TL;DR: Compression of Deep neural networks deployed on embedded device.
Keywords: Sparse representation, Compression Deep Learning Models, L1 regularisation, Optimisation.
Data: [MNIST](https://paperswithcode.com/dataset/mnist)
7 Replies
Loading