DeepTrain: A Programmable Embedded Platform for Training Deep Neural Networks

Duckhwan Kim, Taesik Na, Sudhakar Yalamanchili, Saibal Mukhopadhyay

Published: 2018, Last Modified: 07 Mar 2025. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 2018. License: CC BY-SA 4.0

Abstract: This paper presents DeepTrain, an embedded platform for high-performance and energy-efficient training of deep neural networks (DNNs). The key architectural concept of DeepTrain is a spatially homogeneous computing (and memory) fabric with temporally heterogeneous programmable data flows, which optimizes memory mapping and data reuse during the different phases of training. DeepTrain is demonstrated as an in-memory accelerator integrated in the logic layer of a 3-D memory module. A programming model and supporting architecture utilize the flexible data flow to efficiently accelerate training of various types of DNNs. Cycle-level simulation and a synthesized design in 15 nm FinFET show a power efficiency of 500 GFLOPS/W and nearly the same throughput across a wide range of DNNs, including convolutional, recurrent, and mixed (CNN+RNN) networks.