Hardware-friendly Log-scale Quantization for CNNs with Activation Functions Containing Negative Values

Abstract: With recent advances in deep learning and GPU hardware, research on convolutional neural network (CNN)-based object detection has accelerated, and network compression techniques such as quantization are being actively studied. Because CNNs involve a large number of multiplication operations, conventional linear-scale quantization struggles to fully realize the benefits of network compression when an accelerator is implemented. Log-scale quantization, on the other hand, is well suited to accelerator implementation but causes a comparatively large drop in accuracy. To address this problem, this paper proposes a technique that minimizes the accuracy degradation caused by quantization by re-scaling the activation distribution during log-scale quantization, taking into account activation functions that produce negative values. Experimental results show that the proposed technique improves accuracy by 1.6% over existing log-scale quantization, and reduces hardware resources by more than 87.52% while maintaining accuracy degradation similar to that of existing linear-scale quantization.
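The abstract does not give implementation details, but the core idea of log-scale (power-of-two) quantization applied to activations that can be negative can be illustrated with a minimal sketch. The snippet below is an assumption-laden illustration, not the paper's method: the function name `log2_quantize`, the sign/magnitude split, and the interpretation of the distribution re-scaling step as normalization by the maximum absolute value are all hypothetical choices made for the example.

```python
import numpy as np

def log2_quantize(x, num_bits=4, rescale=True):
    """Power-of-two (log-scale) quantization of a tensor that may contain
    negative values (e.g. outputs of leaky ReLU or ELU).

    The sign is kept separately and only the magnitude is mapped to the
    nearest power of two, so each value becomes sign * 2^e with e drawn
    from a small integer range. `rescale` normalizes the input by its
    maximum absolute value before taking the log, which is one plausible
    (assumed) reading of the paper's distribution re-scaling step.
    """
    sign = np.sign(x)
    mag = np.abs(x)

    # Optional re-scaling so magnitudes fall in (0, 1].
    scale = mag.max() if rescale else 1.0
    scale = scale if scale > 0 else 1.0
    mag = mag / scale

    # Number of exponent levels representable with num_bits
    # (one code conceptually reserved for zero).
    levels = 2 ** num_bits - 1
    e_max = 0                      # after re-scaling, magnitudes are <= 1
    e_min = e_max - (levels - 1)

    # Round log2 of the magnitude to the nearest integer exponent.
    with np.errstate(divide="ignore"):
        exp = np.round(np.log2(np.where(mag > 0, mag, 1.0)))
    exp = np.clip(exp, e_min, e_max)

    # Reconstruct the quantized value: sign * 2^exp, undoing the re-scaling.
    q = sign * np.exp2(exp) * scale
    q[mag == 0] = 0.0
    return q

if __name__ == "__main__":
    acts = np.array([-0.9, -0.05, 0.0, 0.03, 0.4, 1.7])
    print(log2_quantize(acts, num_bits=3))
```

Because every quantized weight or activation is a signed power of two, the multiplications in a convolution can be replaced by bit shifts plus sign handling in hardware, which is the source of the resource savings the abstract refers to.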