A 28 nm 16-kb Sign-Extension-Less Digital-Compute-in-Memory Macro With Extension-Friendly Compute Units and Accuracy-Adjustable Adder-Tree
Abstract: Conventional digital-domain SRAM compute-in-memory (CIM) macros face challenges in handling multiply-and-accumulate (MAC) operations on signed values, requiring either a serial data feeding mode or extra sign-bit processing. The proposed CIM macro has the following features: 1) a sign-extension-less array multiplication circuit structure that eliminates the need to convert partial sums into 2’s complement, which removes the constraints related to handling dedicated sign bits; 2) a shift-and-accumulate circuit that avoids sign-bit extension, resulting in reduced area cost; and 3) an adder structure with adjustable accuracy, which enhances network adaptability compared with traditional approximation techniques. A fabricated 28 nm 16-kb sign-extension-less digital CIM (DCIM) macro was tested and achieved the highest MAC speed, 5.6 ns (signed 8 b input and weight, 23 b output), and the best energy efficiency, 40.15 TOPS/W, over a wide range of network adaptability.
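As a rough illustration of why signed MAC operations do not inherently require sign-extended partial sums, the sketch below decomposes a signed 8 b dot product into unsigned 1-bit products and assigns the most significant bit of each operand a negative weight. This is the generic two's-complement weighting identity, not the macro's actual circuit; the function names (`to_bits`, `bit_weight`, `signed_mac_bitwise`) are hypothetical and chosen only for this example.

```python
def to_bits(value, width=8):
    """Two's-complement bit pattern of a signed integer, LSB first."""
    return [(value >> i) & 1 for i in range(width)]


def bit_weight(position, width=8):
    """Weight of a bit in two's complement: the MSB carries a negative weight."""
    magnitude = 1 << position
    return -magnitude if position == width - 1 else magnitude


def signed_mac_bitwise(inputs, weights, width=8):
    """Signed dot product built only from unsigned 1-bit AND counts.

    Illustrative assumption: each (input bit, weight bit) plane is reduced
    by an unsigned count, and the sign is applied once per plane through
    the bit weights, so no partial sum is ever sign-extended.
    """
    in_bits = [to_bits(x, width) for x in inputs]
    w_bits = [to_bits(w, width) for w in weights]
    acc = 0
    for i in range(width):          # input bit position
        for j in range(width):      # weight bit position
            # Unsigned count of 1-bit products in this bit plane.
            count = sum(xb[i] & wb[j] for xb, wb in zip(in_bits, w_bits))
            acc += bit_weight(i, width) * bit_weight(j, width) * count
    return acc


if __name__ == "__main__":
    xs = [-128, 127, -1, 5]
    ws = [-128, -128, 127, -3]
    print(signed_mac_bitwise(xs, ws), sum(x * w for x, w in zip(xs, ws)))
```

The two printed values agree, since each signed operand equals the sum of its bits times the corresponding weights. How the fabricated macro realizes this in hardware, and how its accuracy-adjustable adder tree approximates the accumulation, is not described in the abstract and is not modeled here.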