Energy-efficient scheduling method with cross-loop model for resource-limited CNN accelerator designs

Abstract: State-of-the-art customized accelerators for convolutional neural networks (CNNs) achieve high throughput, yet the large volume of data movement still dominates total energy cost. In this paper, we propose an energy-efficient scheduling approach that finds a dataflow minimizing data movement under a limited hardware resource budget. Specifically, two-level nested loop transformations are proposed to separate the memory and computing resource constraints, which allows the available memory resources to be fully exploited for reducing off-chip memory traffic. Further, the proposed cross-loop model captures data locality across the nested loops of CNN algorithms. Finally, the energy-delay product is employed as the evaluation criterion to balance energy and throughput. Experimental results show that our cross-loop model reduces off-chip data movement by 11-21% and achieves the theoretical optimum. As a result, the proposed scheduling method increases energy efficiency by at least 8.7 times.
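To illustrate the idea of separating memory and computing constraints with two-level loop tiling, the following is a minimal sketch and not the paper's actual scheduler. The function names, the tile parameters `Tm`, `Tc` (outer tiles, assumed to be bounded by on-chip buffer capacity) and `Pm`, `Pc` (inner tiles, assumed to be bounded by the number of compute units), and the choice to tile only the channel loops are all assumptions made for illustration. The sketch checks that the tiled loop nest computes the same result as a direct convolution, and includes the energy-delay product as a simple scalar criterion.

```python
from itertools import product


def conv_direct(inp, wgt, M, C, H, W, K):
    """Reference: out[m][y][x] = sum over c,i,j of inp[c][y+i][x+j] * wgt[m][c][i][j]."""
    out = [[[0] * W for _ in range(H)] for _ in range(M)]
    for m, y, x, c, i, j in product(range(M), range(H), range(W),
                                    range(C), range(K), range(K)):
        out[m][y][x] += inp[c][y + i][x + j] * wgt[m][c][i][j]
    return out


def conv_two_level(inp, wgt, M, C, H, W, K, Tm, Tc, Pm, Pc):
    """Same convolution with the channel loops tiled at two levels.

    Tm, Tc: hypothetical outer tile sizes (what fits in on-chip memory).
    Pm, Pc: hypothetical inner tile sizes (what the compute array handles at once).
    """
    out = [[[0] * W for _ in range(H)] for _ in range(M)]
    for m0 in range(0, M, Tm):              # outer level: memory-bound tiles
        for c0 in range(0, C, Tc):
            m_end, c_end = min(m0 + Tm, M), min(c0 + Tc, C)
            for m1 in range(m0, m_end, Pm):     # inner level: compute-bound tiles
                for c1 in range(c0, c_end, Pc):
                    # In hardware, this Pm x Pc block would run in parallel.
                    for m in range(m1, min(m1 + Pm, m_end)):
                        for c in range(c1, min(c1 + Pc, c_end)):
                            for y, x, i, j in product(range(H), range(W),
                                                      range(K), range(K)):
                                out[m][y][x] += inp[c][y + i][x + j] * wgt[m][c][i][j]
    return out


def energy_delay_product(energy, delay):
    """Energy-delay product: lower is better for both energy and latency."""
    return energy * delay
```

Because the outer and inner tile sizes are independent parameters, a scheduler could in principle search over `Tm`, `Tc` against the memory budget and over `Pm`, `Pc` against the compute budget separately, then rank candidate schedules by their energy-delay product.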