A Dynamic Approach to Accelerate Deep Learning Training

John Osorio; Adrià Armejach; Eric Petit; Marc Casas

25 Sept 2019 (modified: 05 May 2023)

ICLR 2020 Conference Blind Submission

Readers: Everyone

Keywords: reduced precision, bfloat16, CNN, DNN, dynamic precision, mixed precision

TL;DR: Dynamic precision technique to train deep neural networks

Abstract: Mixed-precision arithmetic, which combines single- and half-precision operands in the same operation, has been successfully applied to train deep neural networks. Although mixed-precision arithmetic reduces the need for key resources such as memory bandwidth and register file size, it has limited capacity to reduce computing costs and requires 32 bits to represent its output operands. This paper proposes two approaches that replace mixed-precision arithmetic with half-precision arithmetic during a large portion of training. The first approach achieves accuracy slightly lower than the state-of-the-art while using half-precision arithmetic during more than 99% of training. The second approach reaches the same accuracy as the state-of-the-art by dynamically switching between half- and mixed-precision arithmetic during training, and uses half-precision during more than 94% of the training process. This paper is the first to demonstrate that half-precision can be used for a very large portion of DNN training while still reaching state-of-the-art accuracy.
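The difference between the two arithmetic modes named in the abstract can be illustrated with a small software emulation. The sketch below is not the authors' implementation: it assumes bfloat16 (taken from the keywords) as the half-precision format, emulates it by rounding float32 values to their upper 16 bits, and ignores NaN handling. The function names `to_bfloat16`, `dot_mixed`, and `dot_half` are hypothetical and chosen only for this example.

```python
import numpy as np

def to_bfloat16(x):
    """Round float32 values to the nearest bfloat16 (ties to even).

    The result is returned as float32 whose low 16 bits are zero.
    Assumption: inputs are finite (NaN payloads are not handled).
    """
    arr = np.asarray(x, dtype=np.float32)
    bits = np.atleast_1d(arr).view(np.uint32)
    # Add 0x7FFF plus the lowest kept bit, then drop the low 16 bits.
    bias = ((bits >> np.uint32(16)) & np.uint32(1)) + np.uint32(0x7FFF)
    rounded = (bits + bias) & np.uint32(0xFFFF0000)
    return rounded.view(np.float32).reshape(arr.shape)

def dot_mixed(a, b):
    """Mixed precision: bfloat16 inputs, float32 accumulator and output."""
    a16, b16 = to_bfloat16(a), to_bfloat16(b)
    acc = np.float32(0.0)
    for x, y in zip(a16, b16):
        acc = np.float32(acc + x * y)
    return acc

def dot_half(a, b):
    """Half precision: the accumulator is rounded to bfloat16 after each step."""
    a16, b16 = to_bfloat16(a), to_bfloat16(b)
    acc = np.float32(0.0)
    for x, y in zip(a16, b16):
        acc = to_bfloat16(np.float32(acc + x * y))
    return acc

ones = np.ones(1024, dtype=np.float32)
print(float(dot_mixed(ones, ones)))  # 1024.0
print(float(dot_half(ones, ones)))   # 256.0
```

With only 8 bits of significand, the bfloat16 accumulator stops growing at 256 because 256 + 1 rounds back to 256, whereas the 32-bit accumulator reaches the exact sum. This kind of accumulation error is one reason a training scheme might fall back to mixed precision for part of the run.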

Code: https://github.com/dynamicprec/dynamic
