A FeFET-Based Compute-in-Memory Architecture on FPGA for Neural Network Inference

Published: 2025 · Last Modified: 17 Sept 2025 · FCCM 2025 · CC BY-SA 4.0
Abstract: Implementing compute-in-memory (CIM) architectures on FPGA offers an effective solution to the von Neumann bottleneck by enabling fast configuration and computation directly within memory. Traditional custom solutions rely on modifying block RAM (BRAM) to implement in-memory computing. However, single-word-line activation of BRAM results in low parallelism, and the need for additional adder trees to accumulate partial sums further limits efficiency. To overcome these limitations, we propose a CIM core based on a 2T1C structure as a replacement for BRAM units. This core exploits a charge-redistribution mechanism and reuses the ADC capacitors, achieving high parallelism, low power consumption, and a compact area. By incorporating computational capability within each cell, our design enables dual parallelism, further enhancing performance and efficiency. In addition, we present an automated deployment and mapping tool for deep neural networks (DNNs) on FPGA, allowing users to rapidly develop FPGA-based solutions for different network architectures. Compared to state-of-the-art solutions, our design achieves a 4.5× improvement in peak throughput and a 53% reduction in area.
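The charge-redistribution computation summarized above can be illustrated with a small behavioral model. The sketch below is not the authors' design: the cell capacitance, the 1-bit weight/activation encoding, the ADC resolution, and the `column_mac` helper are assumptions made purely for illustration. It only shows the general principle that activating many 2T1C cells on one bit line at once lets the settled voltage encode a dot product, which a reused ADC capacitor array then digitizes.

```python
import numpy as np

# Illustrative parameters (assumed for this sketch, not taken from the paper)
C_CELL = 1.0   # unit capacitance of a 2T1C bit cell
N_ROWS = 64    # number of word lines activated in parallel
ADC_BITS = 4   # resolution of the column ADC
V_DD = 1.0     # voltage used to precharge selected cell capacitors

def column_mac(weights, activations):
    """Behavioral model of one CIM column.

    Each 2T1C cell stores a binary weight as charge on its capacitor.
    Activating all word lines simultaneously shares the selected cell
    capacitors onto the bit line, so the settled voltage is proportional
    to the popcount of (weight AND activation), i.e. a 1-bit dot product
    computed by charge redistribution rather than by an adder tree.
    """
    # Charge is contributed only by cells whose weight and input are both 1.
    active = weights & activations
    q_total = active.sum() * C_CELL * V_DD
    # The charge redistributes over all N_ROWS cell capacitors on the line.
    v_bitline = q_total / (N_ROWS * C_CELL)
    # The column ADC (modeled here as an ideal quantizer) digitizes the
    # bit-line voltage into an ADC_BITS-bit code.
    return int(round(v_bitline / V_DD * (2**ADC_BITS - 1)))

rng = np.random.default_rng(0)
w = rng.integers(0, 2, N_ROWS)
x = rng.integers(0, 2, N_ROWS)
print("exact 1-bit dot product:", int((w & x).sum()))
print("ADC code from charge-redistribution model:", column_mac(w, x))
```

In this toy model the ADC code is simply a quantized version of the exact popcount; in a real column the cell mismatch, parasitic capacitance, and ADC architecture determine the achievable precision.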