MASE: An Efficient Representation for Software-Defined ML Hardware System Exploration

NeurIPS 2023 Workshop MLSys Submission 19 Authors

Published: 28 Oct 2023, Last Modified: 12 Dec 2023
Venue: MLSys Workshop, NeurIPS 2023 (Poster)
Keywords: intermediate representation, LLM, AI acceleration
TL;DR: An Efficient Representation for Software-Defined ML Hardware System Exploration
Abstract: Machine learning (ML) accelerators have been studied and used extensively to compute ML models with high performance and low power. However, designing such accelerators normally takes a long time and requires significant effort. Unfortunately, ML software models evolve much faster than the accelerator design cycle, leading to frequent and drastic changes in model architectures and rendering many accelerators obsolete. Existing design tools and frameworks can provide quick accelerator prototyping, but only for a limited range of models that fit on a single hardware device, such as an FPGA. Furthermore, with the emergence of large language models such as GPT-3, there is an increased need to prototype these large models on many-accelerator systems to ensure the hardware can scale with ever-growing model sizes. The resulting design space is huge, spanning both software and hardware optimization. To address this, we propose a novel representation named MASE IR (Machine-learning Accelerator System Exploration Intermediate Representation) that describes data types, software algorithms, and hardware design constraints. MASE IR opens up opportunities for exploring software and hardware co-optimization at scale. As an application of MASE IR, we implemented a PyTorch-based framework named MASE that automatically optimizes and maps an ML model onto an efficient hardware accelerator system. We believe MASE IR will open new research opportunities for ML system design.
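To make the abstract's central idea concrete, the following is a minimal, hypothetical sketch (not MASE's actual API) of an IR in which each graph node carries a software operator, a data type, and a hardware design constraint, so a co-optimization pass can query both the software and hardware sides at once. All class and function names here are illustrative assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class FixedPoint:
    """Illustrative quantized data type: total bit width and fractional bits."""
    width: int
    frac: int

@dataclass
class HardwareConstraint:
    """Illustrative per-node hardware constraint: target device and DSP budget."""
    device: str
    max_dsp: int

@dataclass
class Node:
    """One IR node coupling a software op with its data type and hardware mapping."""
    name: str
    op: str                              # software algorithm, e.g. "linear"
    dtype: FixedPoint
    hw: HardwareConstraint
    inputs: list = field(default_factory=list)

def dsp_per_device(nodes):
    """Toy co-optimization query: aggregate DSP budgets per target device."""
    usage = {}
    for n in nodes:
        usage[n.hw.device] = usage.get(n.hw.device, 0) + n.hw.max_dsp
    return usage

# A two-node graph partitioned across two accelerator devices.
fc1 = Node("fc1", "linear", FixedPoint(8, 4), HardwareConstraint("fpga0", 64))
fc2 = Node("fc2", "linear", FixedPoint(8, 4),
           HardwareConstraint("fpga1", 64), inputs=["fc1"])
print(dsp_per_device([fc1, fc2]))  # {'fpga0': 64, 'fpga1': 64}
```

Because data types and hardware constraints live on the same node as the operator, a single pass over the graph can trade off quantization against per-device resource budgets, which is the kind of scaled software/hardware co-optimization the abstract describes.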
Submission Number: 19