Pipirima: Predicting Patterns in Sparsity to Accelerate Matrix Algebra

Ubaid Bakhtiar, Donghyeon Joo, Bahar Asgari

Published: 2025, Last Modified: 09 May 2026, Venue: DAC 2025, License: CC BY-SA 4.0
Abstract: While sparsity, a feature of data in many applications, provides optimization opportunities such as reducing unnecessary computations, data transfers, and storage, it also causes several challenges. For instance, even in state-of-the-art sparse accelerators, sparsity can result in load imbalance, a performance bottleneck. To address such challenges, our key insight is that if we can quickly anticipate the locations of the non-zero values while reading or streaming a compressed sparse matrix, we can leverage this knowledge to accelerate sparse matrix processing. To enable this, we propose Pipirima, a lightweight prediction-based sparse accelerator. Inspired by traditional branch predictors, Pipirima uses simple, resource-friendly counters to predict the patterns of non-zero values in sparse matrices. We evaluate Pipirima on sparse matrix-vector multiplication (SpMV) and sparse matrix-dense matrix multiplication (SpMM) kernels on CSR-compressed matrices derived from both scientific computing and transformer models. On average, our experiments show $6 \times$ and $4 \times$ speedups over Tensaurus for SpMM and SpMV, respectively, on SuiteSparse workloads. Pipirima also shows a $40 \times$ speedup over ExTensor for SpMM. On less sparse transformer workloads, we achieve $8.3 \times$ and $48.2 \times$ speedups over Tensaurus and ExTensor, respectively. Pipirima consumes $5.621 \mathrm{~mm}^{2}$ of area and 544.93 mW of power in 45 nm technology, with the predictor-related components being the least expensive.
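To make the branch-predictor analogy concrete, the following is a minimal software sketch, not Pipirima's actual design. It assumes one 2-bit saturating counter per column, trained on each CSR row as it streams in and used to guess which columns are non-zero in the next row. The function names `csr_spmv` and `predict_nonzero_columns`, the per-column indexing scheme, and the threshold of 2 are all illustrative assumptions; the abstract does not specify how the counters are organized.

```python
def csr_spmv(indptr, indices, data, x):
    """Reference SpMV y = A @ x for a matrix A stored in CSR form."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(indptr) - 1):
        for k in range(indptr[row], indptr[row + 1]):
            y[row] += data[k] * x[indices[k]]
    return y


def predict_nonzero_columns(indptr, indices, n_cols):
    """Stream CSR rows and predict each row's non-zero columns.

    Assumption (hypothetical scheme): one 2-bit saturating counter per
    column, as in a classic bimodal branch predictor. A counter value
    of 2 or 3 predicts "non-zero"; 0 or 1 predicts "zero".
    Returns (correct_predictions, total_predictions).
    """
    counters = [0] * n_cols
    correct = 0
    total = 0
    for row in range(len(indptr) - 1):
        actual = set(indices[indptr[row]:indptr[row + 1]])
        for col in range(n_cols):
            predicted = counters[col] >= 2
            is_nonzero = col in actual
            correct += int(predicted == is_nonzero)
            total += 1
            # Saturating update: move toward 3 on a non-zero, toward 0 otherwise.
            if is_nonzero:
                counters[col] = min(3, counters[col] + 1)
            else:
                counters[col] = max(0, counters[col] - 1)
    return correct, total


# A 4x4 matrix whose rows all have non-zeros in columns 0 and 2.
indptr = [0, 2, 4, 6, 8]
indices = [0, 2, 0, 2, 0, 2, 0, 2]
data = [1, 2, 3, 4, 5, 6, 7, 8]

y = csr_spmv(indptr, indices, data, [1, 1, 1, 1])
correct, total = predict_nonzero_columns(indptr, indices, n_cols=4)
```

In this toy matrix the counters mispredict columns 0 and 2 for the first two rows and then lock onto the repeating pattern, which is the kind of regularity a counter-based predictor can exploit. Real matrices are far less regular, so the accuracy here says nothing about the hardware results reported above.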