On Continuing DNN Accelerator Architecture Scaling Using Tightly Coupled Compute-on-Memory 3-D ICs

Published: 01 Jan 2023, Last Modified: 02 Mar 2025 · IEEE Trans. Very Large Scale Integr. Syst. 2023 · CC BY-SA 4.0
Abstract: This work identifies the architectural and design scaling limits of 2-D flexible-interconnect deep neural network (DNN) accelerators and addresses them with 3-D ICs. We demonstrate how scaling up a baseline 2-D accelerator in the $X/Y$ dimension fails and how vertical stacking effectively overcomes the failure. We designed multitier accelerators that are $1.67\times$ faster than the 2-D design. Using our 3-D architecture and circuit codesign methodology, we improve throughput, energy efficiency, and area efficiency by up to $5\times$, $1.2\times$, and $3.9\times$, respectively, over 2-D counterparts. The IR drop in our 3-D designs is within 10.7% of VDD, and the temperature variation is within 12 °C.