Abstract: Machine learning models have achieved remarkable success across various domains. However, their computational demands and memory requirements pose challenges for deployment on privacy-secured or wearable edge devices. To address this issue, in this paper we propose an area- and power-efficient multiplier-less processing element (PE). Before implementing the proposed PE, we apply power-of-2 dictionary-based quantization to the model. We analyze how well this quantization method preserves the accuracy of the original model and present both a standard and a specialized schematic diagram of the proposed PE. Our evaluation results demonstrate that our design achieves approximately 30% lower power consumption and a 35% smaller core area than a conventional multiply-accumulate (MAC) PE. Moreover, the applied quantization reduces the model size and operand bit-width, lowering on-chip memory usage and the energy consumed by memory accesses.
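The paper's exact dictionary construction is not reproduced here; the sketch below is a minimal illustration of the general idea, assuming weights lie in (-1, 1) and are rounded to the nearest signed power of two from a small exponent dictionary. The function name `power_of_2_quantize` and the `exponent_bits` parameter are hypothetical, not the authors' API.

```python
import numpy as np

def power_of_2_quantize(weights, exponent_bits=4):
    """Illustrative sketch: snap each weight to the nearest signed power of
    two drawn from a dictionary of 2**exponent_bits negative exponents, so a
    weight can be stored as a sign bit plus a small exponent index."""
    signs = np.sign(weights)
    magnitudes = np.abs(weights)
    nonzero = magnitudes > 0          # zero weights stay exactly zero
    exponents = np.zeros_like(weights, dtype=np.int32)
    exponents[nonzero] = np.round(np.log2(magnitudes[nonzero])).astype(np.int32)
    # Restrict exponents to the dictionary range (weights assumed in (-1, 1)).
    exponents = np.clip(exponents, -(2 ** exponent_bits), -1)
    quantized = signs * (2.0 ** exponents)
    quantized[~nonzero] = 0.0
    return quantized, exponents

# Example: each weight snaps to the nearest signed power of two.
w = np.array([0.23, -0.51, 0.06, 0.0])
q, _ = power_of_2_quantize(w)
print(q)  # approx [0.25, -0.5, 0.0625, 0.0]
```

Because every quantized weight is of the form ±2^e, a weight-activation product reduces to a shift of the activation plus sign handling, which is the property a multiplier-less PE can exploit.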