Abstract: In-memory computing (IMC) has been proposed to enable efficient computation for convolutional neural networks by performing computation within memory. However, the non-idealities of IMC significantly degrade its accuracy. In this work, we leverage a hybrid near/in-memory-computing (NIMC) architecture that allocates sensitive weights to error-free near-memory computing (NMC) and computes the remaining weights with highly efficient IMC. We further propose Budget-based Workload Allocation for NIMC (BWA-NIMC). Specifically, we account for the resource difference between NMC and IMC to allocate workloads effectively under a target resource budget. Simulation results show that BWA-NIMC improves accuracy by 18.38-48.54% under limited budgets (e.g., energy and latency) compared with prior work.
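
To make the idea of budget-constrained allocation concrete, the following is a minimal illustrative sketch and not the BWA-NIMC algorithm itself, whose details the abstract does not give. It assumes each weight group has a hypothetical sensitivity score and hypothetical per-group costs on NMC and IMC, starts with every group on IMC, and greedily moves groups to NMC in order of sensitivity per unit of extra cost until the budget is exhausted. The names `Group`, `allocate`, and all numeric values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    sensitivity: float  # hypothetical accuracy-loss score if computed on IMC
    nmc_cost: float     # hypothetical cost (e.g., energy) when run on NMC
    imc_cost: float     # hypothetical cost when run on IMC


def allocate(groups, budget):
    """Greedy sketch: start all-IMC, then upgrade groups to NMC within the budget."""
    assignment = {g.name: "IMC" for g in groups}
    spent = sum(g.imc_cost for g in groups)
    if spent > budget:
        raise ValueError("budget is below the all-IMC cost")

    def ratio(g):
        # Sensitivity gained per unit of extra cost for moving this group to NMC.
        extra = g.nmc_cost - g.imc_cost
        return g.sensitivity / extra if extra > 0 else float("inf")

    for g in sorted(groups, key=ratio, reverse=True):
        extra = g.nmc_cost - g.imc_cost
        if spent + extra <= budget:
            assignment[g.name] = "NMC"
            spent += extra
    return assignment, spent


# Illustrative values only; they are not taken from the paper.
groups = [
    Group("conv1", sensitivity=0.9, nmc_cost=10, imc_cost=2),
    Group("conv2", sensitivity=0.2, nmc_cost=10, imc_cost=2),
    Group("fc", sensitivity=0.5, nmc_cost=6, imc_cost=1),
]
assignment, spent = allocate(groups, budget=14)
print(assignment, spent)
```

A greedy ratio rule is only one simple heuristic for this kind of knapsack-style trade-off; it is used here because it is easy to follow, not because the source states that it is the method used.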