Speeding up Convolutional Neural Network Training with Dynamic Precision Scaling and Flexible Multiplier-Accumulator

Published: 2016, Last Modified: 07 Mar 2025ISLPED 2016EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: Training convolutional neural network is a major bottleneck when developing a new neural network topology. This paper presents a dynamic precision scaling (DPS) algorithm and flexible multiplier-accumulator (MAC) to speed up convolutional neural network training. The DPS algorithm utilizes dynamic fixed point and finds good enough numerical precision for target network while training. The precision information from DPS is used to configure our proposed MAC. The proposed MAC can perform fixed point computation with variable precision mode providing differentiated computation time which enables speeding up training for lower precision computation. Simulation results show that our work can achieve 5.7x speed-up while consuming 31% energy compared to baseline for modified Alexnet on Flickr image style recognition task.
Loading