Abstract: As DNN models become more complex, the number of parameters they contain grows, and so does the amount of computation they require. Recently, quantization techniques that reduce a model's memory footprint and enable efficient computation have been studied. In this paper, we propose Fake Single Precision Training (FST), which increases accuracy by using a high bit range for weights and a low bit range for activation outputs with a certain probability. FST improves model accuracy by combining features of Google's Quantization Aware Training and Facebook's Quant Noise.
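The sketch below illustrates the general idea the abstract describes: fake (simulated) quantization applied stochastically during training, with a higher bit-width for weights and a lower bit-width for the activation output. It is a minimal illustration, not the paper's actual method; the function names, the probability `p`, and the bit-widths (8 for weights, 4 for activations) are assumptions chosen for the example.

```python
import torch

def fake_quantize(x, num_bits):
    # Uniform symmetric fake quantization: round to an integer grid and
    # scale back, so values remain floating point but lie on quantized levels.
    qmax = 2 ** (num_bits - 1) - 1
    scale = x.abs().max().clamp(min=1e-8) / qmax
    return torch.clamp(torch.round(x / scale), -qmax - 1, qmax) * scale

def fst_linear(x, weight, p=0.5, w_bits=8, a_bits=4):
    # With probability p, apply fake quantization (Quant Noise-style
    # stochastic application): a higher bit range for the weights and a
    # lower bit range for the activation output. Otherwise this forward
    # pass runs in full precision.
    if torch.rand(1).item() < p:
        weight = fake_quantize(weight, w_bits)
        out = x @ weight.t()
        return fake_quantize(out, a_bits)
    return x @ weight.t()
```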