34.5 A 818-4094TOPS/W Capacitor-Reconfigured CIM Macro for Unified Acceleration of CNNs and Transformers
Abstract: In the rapidly evolving landscape of machine learning, hardware must cover workloads built on diverse neural-network architectures: CNNs for image processing, transformers for natural language processing (NLP), and hybrid architectures that blend CNNs and transformers for audio processing. As illustrated in Fig. 34.5.1, these varied architectures have distinct computational-precision requirements. While CNNs achieve satisfactory accuracy even at low compute precision, measured as compute SNR or CSNR [1], transformers require higher CSNR to reach their full potential. This diversity amplifies the need for versatile hardware accelerators that can efficiently handle both CNNs and transformers while meeting the multifaceted demands of modern machine-learning applications.
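Compute SNR (CSNR) quantifies how faithfully a mixed-signal MAC array reproduces the ideal digital result: the ratio of ideal output power to compute-error power, in dB. As a minimal illustrative sketch (not the paper's measurement method), assuming analog compute error can be modeled as additive Gaussian noise on the MAC outputs:

```python
import numpy as np

rng = np.random.default_rng(0)

def csnr_db(ideal, noisy):
    # CSNR = 10*log10(ideal signal power / compute-error power)
    err = noisy - ideal
    return 10 * np.log10(np.sum(ideal**2) / np.sum(err**2))

# Ideal MAC outputs for a batch of 64-input dot products
x = rng.standard_normal((1000, 64))
w = rng.standard_normal(64)
ideal = x @ w

# Model analog compute error as additive Gaussian noise;
# larger noise (lower CSNR) is tolerable for CNNs but not transformers
for sigma in (0.1, 1.0):
    noisy = ideal + rng.normal(0.0, sigma, ideal.shape)
    print(f"sigma={sigma}: CSNR = {csnr_db(ideal, noisy):.1f} dB")
```

Under this model, reducing the noise standard deviation by 10x raises CSNR by 20 dB, which is the kind of headroom a reconfigurable macro can trade against energy efficiency per workload.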