Open Peer Review. Open Publishing. Open Access. Open Discussion. Open Directory. Open Recommendations. Open API. Open Source.
Fixed Point Quantization of Deep Convolutional Networks
Darryl D. Lin, Sachin S. Talathi, V. Sreekanth Annapureddy
Feb 18, 2016 (modified: Feb 18, 2016)ICLR 2016 workshop submissionreaders: everyone
Abstract:In recent years increasingly complex architectures for deep convolution networks (DCNs) have been proposed to boost the performance on image recognition tasks. However, the gains in performance have come at a cost of substantial increase in computation and model storage resources. Fixed point implementation of DCNs has the potential to alleviate some of these complexities and facilitate potential deployment on embedded hardware. In this paper, we formulate and solve an optimization problem to identify the optimal fixed point bit-width allocation across layers to enable efficient fixed point implementation of DCNs. Our experiments show that in comparison to equal bit-width settings, optimized bit-width allocation offers >20% reduction in model size without any loss in accuracy on CIFAR-10 benchmark. We also demonstrate that fine-tuning can further enhance the accuracy of fixed point DCNs beyond that of the original floating point model. In doing so, we report a new state-of-the-art fixed point performance of 6.78% error-rate on CIFAR-10 benchmark.
Enter your feedback below and we'll get back to you as soon as possible.