BinaryDuo: Reducing Gradient Mismatch in Binary Activation Network by Coupling Binary Activations

Anonymous

Sep 25, 2019 Blind Submission readers: everyone
  • Abstract: Binary Neural Networks (BNNs) have been gaining interest thanks to their reduced computing cost and memory savings. However, BNNs suffer from performance degradation, mainly due to the gradient mismatch caused by binarizing activations. Previous works tried to address the gradient mismatch by reducing the discrepancy between the activation functions used in the forward and backward passes, which is an indirect measure. In this work, we introduce the coordinate discrete gradient (CDG) to better estimate the gradient mismatch. Analysis using the CDG indicates that using higher precision for activations is more effective than modifying the backward pass of the binary activation function. Based on this observation, we propose a new training scheme for binary activation networks, called BinaryDuo, in which two binary activations are coupled into a ternary activation during training. Experimental results show that BinaryDuo outperforms state-of-the-art BNNs on various benchmarks with the same number of parameters and computing cost.
  • Code: https://drive.google.com/open?id=1NxZdaSB7gZPMVH35hqp1xaqZ7ilwVAtD
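
The coupling mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the linked code: the function names and the threshold values 0.25 and 0.75 are assumptions chosen only for illustration. It shows how the sum of two binary (step) activations with different thresholds behaves as a single three-level (ternary) activation.

```python
def binary_act(x, threshold):
    """Binary step activation: 1 if x exceeds the threshold, else 0."""
    return 1 if x > threshold else 0


def coupled_ternary_act(x, low=0.25, high=0.75):
    """Sum of two binary activations with different thresholds.

    The thresholds `low` and `high` are hypothetical values, not the
    paper's. Because low < high, the output takes one of three levels:
    0 (below both), 1 (between them), or 2 (above both).
    """
    return binary_act(x, low) + binary_act(x, high)


if __name__ == "__main__":
    for x in (0.0, 0.5, 1.0):
        print(x, coupled_ternary_act(x))
```

In this sketch, an input below both thresholds gives 0, an input between them gives 1, and an input above both gives 2, so the pair of binary units carries the same information as one ternary unit.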