Abstract: This paper presents a ferroelectric FET (FeFET)-based processing-in-memory (PIM) architecture to accelerate the inference of deep neural networks (DNNs). We propose a digital in-memory vector-matrix multiplication (VMM) engine design utilizing the FeFET crossbar to enable bit-parallel computation and eliminate analog-to-digital conversion in prior mixed-signal PIM designs. A dedicated hierarchical network-on-chip (H-NoC) is developed for input broadcasting and on-the-fly partial results processing, reducing the data transmission volume and latency. Simulations in 28-nm CMOS technology show $115\times $ and $6.3\times $ higher computing efficiency (GOPs/W) over desktop GPU (Nvidia GTX 1080Ti) and resistive random access memory (ReRAM)-based design, respectively.
External IDs:doi:10.1109/jxcdc.2019.2923745
Loading