DASH: Data-Efficient Learned Cost Models for Sparse Matrix Computations on Emerging Hardware Platforms
Keywords: learned cost models, sparse accelerators, transfer learning, ml for systems
TL;DR: We propose a framework that utilizes transfer learning to develop data-efficient learned cost models to optimize sparse matrix computations on emerging hardware.
Abstract: Sparse matrix computations are becoming increasingly significant in deep learning and graph analytics, driving the development of specialized hardware accelerators to meet the growing need for optimized performance. Optimizing these computations, however, presents significant challenges due to their sensitivity to variations in input sparsity patterns and code optimizations. While ML-based cost models and search techniques have shown promise in optimizing sparse matrix computations on general-purpose hardware such as CPUs, these cost models require large datasets for effective training. Collecting such extensive datasets is particularly impractical for emerging hardware platforms, for which only expensive simulators are available in the early design stages. To overcome this, we propose DASH, which trains learned cost models using low-cost data samples from widely accessible general-purpose hardware (such as CPUs), followed by few-shot fine-tuning to efficiently adapt to emerging hardware platforms. DASH introduces a novel approach that leverages the homogeneity of input features across different hardware platforms while effectively mitigating their heterogeneity. This enables DASH to achieve comparable accuracy using only 5% of the data samples required by a cost model trained exclusively on data samples from an accelerator. We evaluate DASH on two critical sparse operations—SpMM and SDDMM—on an emerging sparse accelerator using 715 distinct sparsity patterns. Our experimental results show that DASH outperforms existing techniques that use transfer learning by 28.44%, achieving average speedups of 1.47x (up to 5.46x) for SpMM and 1.39x (up to 4.22x) for SDDMM.
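To make the pretrain-then-few-shot-fine-tune recipe described in the abstract concrete, the sketch below illustrates the general idea in PyTorch. All details here are assumptions, not the paper's method: the MLP cost model, the feature dimensionality, the random placeholder data, and the choice to freeze the shared backbone during adaptation are purely illustrative; DASH's actual architecture, features, and training procedure may differ.

```python
# Minimal sketch: pretrain a cost model on abundant CPU samples, then
# few-shot fine-tune it on scarce accelerator (simulator) samples.
# All names, shapes, and data below are hypothetical placeholders.
import torch
import torch.nn as nn

class CostModel(nn.Module):
    def __init__(self, num_features: int, hidden: int = 128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(num_features, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.head = nn.Linear(hidden, 1)  # predicted cost (e.g., log runtime)

    def forward(self, x):
        return self.head(self.backbone(x)).squeeze(-1)

def train(model, x, y, epochs, lr):
    opt = torch.optim.Adam(
        (p for p in model.parameters() if p.requires_grad), lr=lr
    )
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()

num_features = 32  # placeholder feature-vector length
model = CostModel(num_features)

# 1) Pretrain on cheap, plentiful CPU measurements (random stand-in data).
cpu_x, cpu_y = torch.randn(10_000, num_features), torch.randn(10_000)
train(model, cpu_x, cpu_y, epochs=50, lr=1e-3)

# 2) Few-shot fine-tune on a small accelerator dataset: freeze the shared
#    backbone (exploiting feature homogeneity) and adapt only the head.
for p in model.backbone.parameters():
    p.requires_grad = False
acc_x, acc_y = torch.randn(500, num_features), torch.randn(500)
train(model, acc_x, acc_y, epochs=100, lr=1e-3)
```

The split into a frozen backbone and a small adapted head is one simple way to reuse knowledge learned from cheap CPU data while adapting to the accelerator with few samples; the paper's mechanism for handling cross-platform heterogeneity is not reproduced here.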
Primary Area: other topics in machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 13448