muRISCV-NN: Challenging Zve32x Autovectorization with TinyML Inference Library for RISC-V Vector Extension

Published: 01 Jan 2024, Last Modified: 07 Mar 2025CF (Companion) 2024EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: With the rapid adoption of deep learning workloads to resource-constrained edge devices, efficient and data-parallel computing paradigms are becoming increasingly important. The RISC-V ISA provides a set of vector extensions featuring powerful data computation capabilities to accelerate deep learning workloads at the edge. However, the RISC-V ecosystem lacks a lightweight, open-source, and vendor-agnostic compute library to support these extensions on embedded platforms. After porting the existing ARM Cortex-M specific kernel implementation to the RISC-V vector ISA, we optimized the operator implementations to make the most out of the data-level parallelism provided by supported targets. In comparison to programs vectorized by LLVM's built-in auto-vectorizer, we see an up to 60% advantage in runtime for convolutional models and large vectors while introducing less ROM overheads. Furthermore, muRISCV-NN integrates well with existing ML deployment frameworks, is bit-accurate to CMSIS-NN, and can, thus, be used as a drop-in replacement with minimal changes to the compilation flow.
Loading

OpenReview is a long-term project to advance science through improved peer review with legal nonprofit status. We gratefully acknowledge the support of the OpenReview Sponsors. © 2025 OpenReview