Energy-Efficient RISC-V-Based Vector Processor for Cache-Aware Structurally-Pruned Transformers

Published: 2023, Last Modified: 16 May 2025. Venue: ISLPED 2023. License: CC BY-SA 4.0
Abstract: Building on recent RISC-V designs, this paper presents a low-power vector processor architecture for efficiently deploying vision transformer (ViT) models. To fairly compare the processing efficiency of processor designs with instruction and data caches, we first develop an evaluation framework, built on multiple design tools, that jointly considers algorithm, architecture, and circuit performance. The framework shows numerically that the previous CSR-based data compression fails to accelerate pruned transformer models at all, because the vector-extended processing units are under-utilized. We then introduce a series of algorithm-hardware co-optimization approaches that greatly reduce cache misses: 1) accuracy-preserving structured ViT pruning, 2) the vertical-CSR (vCSR) data storage format, and 3) vCSR-aware custom memory-access instructions. Experimental results show that the proposed optimizations improve the processing efficiency of pruned transformers on resource-limited computing platforms, e.g., achieving 11 times lower energy consumption when running the 0.7-pruned ViT model.
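For readers unfamiliar with the baseline format, the following is a minimal sketch of conventional CSR (compressed sparse row) storage and a matrix-vector product over it. It is not the paper's vCSR layout, which the abstract does not specify. The matrix `W`, the vector `x`, and the helper names `to_csr`, `csr_matvec`, and `row_lengths` are hypothetical and chosen only for illustration. The uneven per-row nonzero counts and the indirect reads of `x` hint at why fixed-width vector units might sit partly idle on unstructured sparsity, although the paper's own analysis may rest on different factors.

```python
def to_csr(dense):
    """Encode a dense matrix (list of rows) in standard CSR form."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)   # nonzero value
                col_idx.append(j)  # its column index
        row_ptr.append(len(values))  # running end offset of each row
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = W @ x, where W is given in CSR form."""
    y = []
    for r in range(len(row_ptr) - 1):
        acc = 0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            # Indirect (gather-style) read of x through col_idx.
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y


def row_lengths(row_ptr):
    """Number of stored nonzeros in each row."""
    return [row_ptr[i + 1] - row_ptr[i] for i in range(len(row_ptr) - 1)]


# Hypothetical 3x4 weight matrix with unstructured sparsity.
W = [
    [1, 0, 0, 2],
    [0, 0, 3, 0],
    [4, 5, 0, 6],
]
x = [1, 2, 3, 4]

values, col_idx, row_ptr = to_csr(W)
y = csr_matvec(values, col_idx, row_ptr, x)
lengths = row_lengths(row_ptr)

print(values, col_idx, row_ptr)
print(y, lengths)
```

In this toy case the rows hold 2, 1, and 3 nonzeros, so a vector unit that processes a fixed number of elements per instruction would see a different amount of useful work per row.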