Quantized Back-Propagation: Training Binarized Neural Networks with Quantized Gradients

Itay Hubara, Elad Hoffer, Daniel Soudry

Feb 12, 2018 (modified: Feb 12, 2018) ICLR 2018 Workshop Submission
  • Abstract: Binarized Neural Networks (BNNs) have been shown to be effective in improving network efficiency during the inference phase, after the network has been trained. However, BNNs binarize only the model parameters and activations during propagation. We show there is no inherent difficulty in training BNNs with "Quantized Back-Propagation" (QBP), in which the error gradients are also quantized and, in the extreme case, ternarized. To avoid significant degradation in test accuracy, we apply stochastic ternarization and increase the number of filter maps in each convolution layer. QBP has the potential to significantly improve execution efficiency (\emph{e.g.}, reduce the dynamic memory footprint and computational energy) and speed up the training process, even after such an increase in network size.
  • TL;DR: By quantizing only the sequential error gradients, we can accelerate DNN training while maintaining high accuracy.
  • Keywords: Neural Network Acceleration, Neural Network Compression
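The stochastic ternarization mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the common scheme in which each gradient entry is scaled by the tensor's largest magnitude and then rounded to {-1, 0, +1} with probability proportional to its magnitude, which keeps the quantized estimate unbiased in expectation. The function name `stochastic_ternarize` is hypothetical.

```python
import numpy as np

def stochastic_ternarize(grad, rng=None):
    """Hypothetical sketch: map a gradient tensor to {-1, 0, +1} stochastically.

    Each entry keeps its sign with probability |g| / sigma (sigma = max |g|),
    so E[sigma * q] = grad, i.e. the ternary estimate is unbiased.
    """
    rng = rng or np.random.default_rng(0)
    sigma = np.max(np.abs(grad))
    if sigma == 0:
        return np.zeros_like(grad), 0.0
    p = np.abs(grad) / sigma                      # keep-probability per entry
    q = np.sign(grad) * (rng.random(grad.shape) < p)
    return q, sigma                               # grad is approximated by sigma * q

g = np.array([0.5, -0.2, 0.0, 1.0])
q, s = stochastic_ternarize(g)
```

During back-propagation, the dense float gradient would be replaced by the pair `(q, s)`, so the expensive multiply-accumulate operations reduce to sign flips and additions, with a single float rescale per tensor.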