Abstract: As DNN models become more complex, the number of parameters they contain grows, and so does the amount of computation they require. Recently, quantization techniques that reduce a model's memory footprint and enable efficient computation have been studied. In this paper, we propose Fake Single Precision Training (FST), which increases accuracy by using a high bit range for weights and a low bit range for activation outputs with a certain probability. FST improves model accuracy by combining features of Google's Quantization Aware Training and Facebook's Quant Noise.
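The sketch below illustrates the general idea the abstract describes: fake (simulated) quantization applied stochastically during training, with a higher bit-width for weights and a lower bit-width for the activation output. It is a minimal illustration, not the paper's actual method; the function names, the probability `p`, and the bit-widths (8 for weights, 4 for activations) are assumptions chosen for the example.

```python
import torch

def fake_quantize(x, num_bits):
    # Uniform symmetric fake quantization: round to an integer grid and
    # scale back, so values remain floating point but lie on quantized levels.
    qmax = 2 ** (num_bits - 1) - 1
    scale = x.abs().max().clamp(min=1e-8) / qmax
    return torch.clamp(torch.round(x / scale), -qmax - 1, qmax) * scale

def fst_linear(x, weight, p=0.5, w_bits=8, a_bits=4):
    # With probability p, apply fake quantization (Quant Noise-style
    # stochastic application): a higher bit range for the weights and a
    # lower bit range for the activation output. Otherwise this forward
    # pass runs in full precision.
    if torch.rand(1).item() < p:
        weight = fake_quantize(weight, w_bits)
        out = x @ weight.t()
        return fake_quantize(out, a_bits)
    return x @ weight.t()
```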