Abstract: Ultra low-precision (< 8-bit width) arithmetic is a promising approach for deploying deep learning networks onto edge devices. Recent findings show that posit with linear quantization has a dynamic range similar to that of the weight and activation values across deep neural network layers. This characteristic can benefit the data representation of deep neural networks without impacting overall accuracy. However, capturing the full dynamic range of weights and activations with mixed-precision posit or linear quantization leads to a surge in hardware resource requirements. We propose adaptive posit, a format able to capture the non-homogeneous dynamic range of weights and activations across deep neural network layers. Fine-grained control is achieved by embedding the hyper-parameters in the numerical format itself. To evaluate overall system efficiency, we design a parameterized ASIC soft core for the adaptive posit encoder and decoder. Benchmarking and evaluation of adaptive posit are performed on three datasets: Fashion-MNIST, CIFAR-10, and ImageNet. Results assert that, on average, inference performance with < 8-bit adaptive posits surpasses that of posit by 2% to 10%.