Abstract: With the deployment of neural networks on mobile devices and the need to transmit neural networks over limited or expensive channels, the file size of trained models has been identified as a bottleneck. We propose a codec for the compression of neural networks that is based on transform coding for convolutional and dense layers and on clustering for biases and normalizations. With this codec, we achieve average compression factors between 7.9 and 9.3, while the accuracy of the compressed networks for image classification decreases by only 1% and 2%, respectively.
Keywords: neural network compression, transform coding, clustering, codec
TL;DR: Our neural network codec, which is based on transform coding and clustering, enables low-complexity, highly efficient, and transparent compression of neural networks.
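To make the two building blocks named in the abstract concrete, the following is a minimal sketch, not the authors' codec. It assumes a 1-D orthonormal DCT with uniform quantization as a stand-in for "transform coding" and plain 1-D k-means as a stand-in for "clustering". All function names, the step size, and the number of clusters are hypothetical choices for illustration, and entropy coding of the resulting symbols is omitted.

```python
import math

def dct_1d(x):
    """Orthonormal DCT-II of a 1-D sequence (assumed transform)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct_1d(c):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out

def transform_encode(weights, step):
    """Transform the weights, then quantize each coefficient to an integer level."""
    return [round(c / step) for c in dct_1d(weights)]

def transform_decode(levels, step):
    """Rescale the integer levels and apply the inverse transform."""
    return idct_1d([q * step for q in levels])

def kmeans_1d(values, k, iters=20):
    """Lloyd's algorithm in 1-D; returns (centers, assignments)."""
    lo, hi = min(values), max(values)
    # Spread the initial centers evenly over the value range.
    centers = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]

    def nearest(v):
        return min(range(k), key=lambda j: abs(v - centers[j]))

    for _ in range(iters):
        assign = [nearest(v) for v in values]
        for j in range(k):
            members = [v for v, a in zip(values, assign) if a == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = sum(members) / len(members)
    return centers, [nearest(v) for v in values]

# Hypothetical usage: one row of dense-layer weights and a small bias vector.
row = [0.5, -0.25, 0.125, 0.0, 0.3, -0.7, 0.2, 0.1]
levels = transform_encode(row, step=0.05)
row_hat = transform_decode(levels, step=0.05)

biases = [0.0, 0.01, -0.01, 1.0, 1.02, 0.98]
centers, idx = kmeans_1d(biases, k=2)
biases_hat = [centers[a] for a in idx]
```

In this sketch only the integer levels (plus the step size) and the cluster indices (plus the small table of centers) would need to be stored. Because the transform is orthonormal, the reconstruction error of the weights is bounded by the quantization step, which is the knob that would trade file size against accuracy.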