30.6 Vecim: A 289.13GOPS/W RISC-V Vector Co-Processor with Compute-in-Memory Vector Register File for Efficient High-Performance Computing

Yipeng Wang, Mengtian Yang, Chieh-Pu Lo, Jaydeep P. Kulkarni

Published: 2024, Last Modified: 18 Apr 2026, Venue: ISSCC 2024, License: CC BY-SA 4.0
Abstract: Vector processors have re-emerged in high-performance computing and flagship mobile SoC designs for their improved programmability, appealing power efficiency over multicore processors, and area efficiency over GPUs [1]. For modern data-parallel tasks (ML, DSP, vision), vector processors can approach the efficiency of custom designs while maintaining the flexibility of CPUs and GPUs. However, three limitations still hinder the widespread adoption of vector architectures in general-purpose systems: expensive on-chip data movement, high off-chip memory bandwidth requirements, and Vector Register File (VRF) complexity.