Abstract: Sparse Triangular Solve (SpTRSV) is a critical level-2 kernel in sparse Basic Linear Algebra Subprograms (BLAS). Existing Field-Programmable Gate Array (FPGA) accelerators for SpTRSV focus on optimizing individual tiles and overlook inter-tile parallelism. Designing an accelerator that exploits inter-tile parallelism poses several challenges, including constructing a fine-grained dependency graph, handling communication overhead, and balancing workloads. HiSpTRSV addresses these challenges through dependency graph parsing, a tile-based highly parallel algorithm, filtering mechanisms, and bidirectional matching with modular indexing. Experiments show that HiSpTRSV delivers a 34.3% performance improvement over the state-of-the-art SpTRSV accelerator. HiSpTRSV also achieves a $3.58 \times$ speedup and $9.59 \times$ higher energy efficiency compared to GPUs.
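To make the role of the dependency graph concrete, the following is a minimal sketch of a generic row-level, level-scheduled SpTRSV in Python. It is not the HiSpTRSV tile-based algorithm, whose details the abstract does not give. The CSR layout and the names `build_levels` and `sptrsv_lower` are assumptions introduced only for illustration. Rows placed in the same level have no dependencies on one another, so a parallel implementation could solve them concurrently.

```python
def build_levels(indptr, indices, n):
    """Group rows of a lower-triangular CSR matrix into dependency levels.

    A row's level is one more than the deepest level among the rows it
    depends on (its strictly-lower nonzero columns).
    """
    level = [0] * n
    for i in range(n):
        for k in range(indptr[i], indptr[i + 1]):
            j = indices[k]
            if j < i:
                level[i] = max(level[i], level[j] + 1)
    groups = [[] for _ in range(max(level, default=-1) + 1)]
    for i, lv in enumerate(level):
        groups[lv].append(i)
    return groups


def sptrsv_lower(indptr, indices, data, b):
    """Solve L x = b for a lower-triangular L stored in CSR, level by level."""
    n = len(b)
    x = [0.0] * n
    for rows in build_levels(indptr, indices, n):
        # Rows within one level are mutually independent (parallelizable).
        for i in rows:
            acc = b[i]
            diag = None
            for k in range(indptr[i], indptr[i + 1]):
                j = indices[k]
                if j == i:
                    diag = data[k]
                elif j < i:
                    acc -= data[k] * x[j]
            if diag is None or diag == 0:
                raise ValueError(f"row {i} has no nonzero diagonal entry")
            x[i] = acc / diag
    return x


# Hypothetical example: L = [[2, 0, 0], [0, 4, 0], [1, 3, 5]], b = [2, 8, 14]
indptr = [0, 1, 2, 5]
indices = [0, 1, 0, 1, 2]
data = [2.0, 4.0, 1.0, 3.0, 5.0]
b = [2.0, 8.0, 14.0]
levels = build_levels(indptr, indices, 3)
x = sptrsv_lower(indptr, indices, data, b)
```

In this example, rows 0 and 1 depend on nothing and share the first level, while row 2 depends on both and forms the second level. The number of levels bounds the sequential depth of the solve, which is why exposing more independent work per level matters for a parallel accelerator.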
External IDs: dblp:conf/dac/SunDS25