34.5 A 818-4094TOPS/W Capacitor-Reconfigured CIM Macro for Unified Acceleration of CNNs and Transformers
Abstract: In the rapidly evolving landscape of machine learning, hardware must cover workloads built on diverse neural-network architectures: CNNs for image processing, transformers for natural language processing (NLP), and hybrid architectures that blend CNNs and transformers for audio processing. As illustrated in Fig. 34.5.1, these varied architectures have distinct computational-precision requirements. While CNNs achieve satisfactory accuracy even at low compute precision, measured as compute SNR or CSNR [1], transformers require higher CSNR to reach their full potential. This diversity amplifies the need for versatile hardware accelerators that can efficiently handle both CNNs and transformers while meeting the multifaceted demands of modern machine-learning applications.
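Compute SNR (CSNR) quantifies how faithfully a mixed-signal MAC array reproduces the ideal digital result: the ratio of ideal output power to compute-error power, in dB. As a minimal illustrative sketch (not the paper's measurement method), assuming analog compute error can be modeled as additive Gaussian noise on the MAC outputs:

```python
import numpy as np

rng = np.random.default_rng(0)

def csnr_db(ideal, noisy):
    # CSNR = 10*log10(ideal signal power / compute-error power)
    err = noisy - ideal
    return 10 * np.log10(np.sum(ideal**2) / np.sum(err**2))

# Ideal MAC outputs for a batch of 64-input dot products
x = rng.standard_normal((1000, 64))
w = rng.standard_normal(64)
ideal = x @ w

# Model analog compute error as additive Gaussian noise;
# larger noise (lower CSNR) is tolerable for CNNs but not transformers
for sigma in (0.1, 1.0):
    noisy = ideal + rng.normal(0.0, sigma, ideal.shape)
    print(f"sigma={sigma}: CSNR = {csnr_db(ideal, noisy):.1f} dB")
```

Under this model, reducing the noise standard deviation by 10x raises CSNR by 20 dB, which is the kind of headroom a reconfigurable macro can trade against energy efficiency per workload.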