Abstract: Convolutional neural networks (CNNs) have achieved excellent accuracy in a variety of computer vision applications. However, for a CNN to operate on embedded platforms such as mobile devices, hardware resources and power consumption must be reduced. Accordingly, research on applying low-precision quantization to lightweight networks, such as MobileNet, has attracted considerable attention. In particular, compared to linear quantization, logarithmic quantization can significantly reduce hardware resources by processing multiplication operations as addition operations when implementing a hardware accelerator. In this study, we propose a novel logarithmic weight quantization that considers the characteristics of MobileNetV2, which is notoriously difficult to quantize, together with a mixed-precision quantization that minimizes accuracy loss by learning the distribution range through a trainable parameter $\alpha$. Experimental results show that the proposed method improves accuracy by more than 1.47% and 2% on the CIFAR-10 and Tiny-ImageNet datasets, respectively, compared to general log-scale quantization methods. As a result, the proposed method achieves a significant hardware resource reduction with only a slight performance degradation compared to full precision (i.e., FP32), and reduces power consumption by about 48% compared to linear-scale quantization at the same precision.
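To make the core idea concrete, the following is a minimal sketch of log-scale weight quantization, showing how a multiplication by a power-of-two weight reduces to an exponent shift. This is an illustrative assumption of the general technique, not the paper's exact method; the function names, bit width, and clipping range are invented for the example.

```python
import numpy as np

def log2_quantize(w, bits=4):
    """Quantize weights to signed powers of two.

    Illustrative sketch of generic log-scale quantization; the
    parameter names and clipping range are assumptions, not the
    paper's notation.
    """
    sign = np.sign(w)
    # Round the log2 magnitude to the nearest integer exponent,
    # clipped to the representable range of `bits`-bit exponents.
    exp = np.clip(np.round(np.log2(np.abs(w) + 1e-12)),
                  -(2 ** (bits - 1)), 0)
    return sign, exp.astype(int)

def shift_multiply(x, sign, exp):
    """Multiplying by a power-of-two weight needs no multiplier:
    in hardware it is a bit shift, emulated here with ldexp."""
    return sign * np.ldexp(x, exp)

w = np.array([0.24, -0.6, 0.03])   # full-precision weights
x = np.array([1.0, 2.0, 4.0])      # activations
s, e = log2_quantize(w)            # e.g. exponents [-2, -1, -5]
y = shift_multiply(x, s, e)        # shifts instead of multiplies
```

In a hardware accelerator, `shift_multiply` corresponds to a barrel shifter plus sign handling, which is the source of the resource and power savings over a full multiplier array.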