DE-C3: Dynamic Energy-Aware Compression for Computing-In-Memory-Based Convolutional Neural Network Acceleration

Published: 01 Jan 2023, Last Modified: 21 Oct 2023 (SOCC 2023)
Abstract: Convolutional neural networks (CNNs) are leveraged in many applications, such as image classification and natural language processing (NLP). However, hardware implementations of CNNs not only occupy a considerable amount of memory but also incur high computational demands, which can lead to significant energy costs for data transfer. Emerging computing-in-memory (CIM) architectures have therefore been proposed to alleviate the energy bottleneck of CNNs. Various model compression methods, including quantization and pruning, have recently been studied to enhance the energy efficiency of CIM-based accelerators. However, these methods have been discussed separately and only under a static inference scenario. In this paper, the proposed DE-C3 dynamically and jointly designs the compression strategy, adopting trainable energy-aware thresholds for both quantization and pruning. Experimental results on the CIFAR-10 dataset show that DE-C3 achieves up to 3.4× energy reduction compared with state-of-the-art works.
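
The abstract does not specify how the trainable energy-aware thresholds are realized. A common way to make pruning and quantization thresholds learnable is via straight-through estimators (STE); the PyTorch sketch below illustrates that general idea under stated assumptions and is not the paper's actual method. All names (EnergyAwareCompress, alpha, beta, energy_proxy) are hypothetical.

```python
# Hypothetical sketch of trainable energy-aware thresholds for joint
# pruning and quantization. Not DE-C3's published code; all names and
# design choices here are illustrative assumptions.
import torch
import torch.nn as nn


def ste_round(x):
    # Round in the forward pass; pass gradients straight through.
    return x + (torch.round(x) - x).detach()


class EnergyAwareCompress(nn.Module):
    """Prunes and quantizes a weight tensor using two trainable
    thresholds: alpha (pruning magnitude) and beta (clipping range)."""

    def __init__(self, num_bits=8, temperature=10.0):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(0.05))  # pruning threshold
        self.beta = nn.Parameter(torch.tensor(1.0))    # quantization clip range
        self.levels = 2 ** (num_bits - 1) - 1          # signed quantization levels
        self.temperature = temperature

    def forward(self, w):
        # Soft surrogate of the binary keep/prune decision so gradients
        # can reach alpha; the hard mask is used in the forward pass (STE).
        soft = torch.sigmoid(self.temperature * (w.abs() - self.alpha))
        mask = (soft > 0.5).float() + soft - soft.detach()
        b = self.beta.abs() + 1e-6                     # keep clip range positive
        w_clipped = torch.clamp(w * mask, -b, b)
        step = b / self.levels
        return ste_round(w_clipped / step) * step      # uniform quantization

    def energy_proxy(self, w):
        # Differentiable proxy for compute energy: the fraction of
        # surviving weights. Adding it to the task loss (weighted by a
        # coefficient) pressures alpha toward more aggressive pruning.
        return torch.sigmoid(self.temperature * (w.abs() - self.alpha)).mean()


# Illustrative usage: compress a conv layer's weights and obtain the
# energy penalty term to add to the training loss.
layer = nn.Conv2d(3, 16, 3)
comp = EnergyAwareCompress(num_bits=4)
w_q = comp(layer.weight)                      # compressed weights
penalty = comp.energy_proxy(layer.weight)     # e.g. loss = task_loss + 0.1 * penalty
```

Because both thresholds receive gradients, the trade-off between accuracy and energy can be tuned end-to-end rather than fixed statically, which matches the dynamic, joint design the abstract describes at a high level.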