Distributed Accumulation based Energy Efficient STT-MRAM based Digital PIM Architecture

Published: 01 Jan 2022, Last Modified: 26 May 2025, Venue: ISOCC 2022, License: CC BY-SA 4.0
Abstract: Spin-transfer torque MRAM (STT-MRAM) based digital processing-in-memory (PIM) has recently been proposed for energy-efficient processing of convolutional neural networks (CNNs) without analog-to-digital converters (ADCs). However, because a digital PIM can only compute on operands stored in the same row, computations on data stored in different rows consume considerable energy for data transfer between memory rows. In this paper, we present an energy-efficient digital PIM architecture that supports a distributed accumulation scheme. In the proposed PIM architecture, the accumulation workload is distributed between the memory arrays and the peripheral circuits: only the portion of the accumulations that does not incur a heavy data transfer cost is performed in the memory array, and the remaining accumulations, which would require complicated data transfer within the memory array, are performed by the peripheral circuits. In addition, because the values of the partial sums are read out before accumulation, the zero-skipping technique, which cannot be applied when the computations are performed entirely in the memory array, can also be applied in the proposed PIM architecture. Simulations with a 28 nm CMOS process show that the proposed digital PIM architecture with zero prediction achieves energy savings of up to 54.3% over conventional digital PIMs.
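The distributed accumulation idea described in the abstract can be illustrated with a small behavioural sketch. This is a toy software model written for illustration only, not the paper's circuit or simulator: the function names, the list-of-rows data layout, and the use of a skip counter as a stand-in for saved work are all assumptions. Each row accumulates the products of its own operands (the in-array part), and a separate step adds the row-level partial sums while skipping those that are zero (the peripheral part).

```python
from typing import List, Tuple


def in_array_partial_sums(weights: List[List[int]],
                          activations: List[List[int]]) -> List[int]:
    """Row-local accumulation (hypothetical model of the in-array step).

    Each row only combines operands stored in that same row, so no
    data has to move between rows at this stage.
    """
    return [
        sum(w * a for w, a in zip(w_row, a_row))
        for w_row, a_row in zip(weights, activations)
    ]


def peripheral_accumulate(partial_sums: List[int]) -> Tuple[int, int]:
    """Cross-row accumulation (hypothetical model of the peripheral step).

    Partial sums are read out before being added, so zero values can
    be detected and skipped. Returns (total, number_of_skipped_adds).
    """
    total = 0
    skipped = 0
    for p in partial_sums:
        if p == 0:
            skipped += 1  # zero-skipping: no addition is performed
            continue
        total += p
    return total, skipped


# Illustrative data: three rows, three operand pairs per row.
weights = [[1, 0, 2], [0, 0, 0], [3, 1, 0]]
activations = [[4, 5, 0], [7, 8, 9], [1, 2, 6]]

partials = in_array_partial_sums(weights, activations)
total, skipped = peripheral_accumulate(partials)
print(partials, total, skipped)
```

In this toy model the skip counter only indicates how many cross-row additions were avoided; it says nothing about actual energy, which depends on the circuit-level details reported in the paper.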