An End-to-End Programming Model for AI Engine Architectures

Published: 01 Jan 2024, Last Modified: 01 Oct 2024 · HEART 2024 · CC BY-SA 4.0
Abstract: Coarse-Grained Reconfigurable Architectures (CGRAs) are becoming a promising alternative to conventional computing architectures such as CPUs, GPUs, and FPGAs when energy efficiency and high performance are both required. Like CPUs and GPUs, CGRAs have processing elements (PEs) that can perform complex operations, such as vectorized arithmetic, and like FPGAs, they support a reconfigurable topology of components. Because of their coarser-grained reconfigurability, they are less challenging to program than FPGAs but more challenging than CPUs and GPUs. This paper presents an end-to-end programming model for AMD AI Engine CGRAs that enables programming simultaneously at a high level and at a very implementation-specific level, all in the same language and all in the same "flow". Our programming model allows users to specify, implement, and test designs on-device, enabling the productive development of dataflow programs for streaming applications. The programming model is open source and includes a language frontend (a Python eDSL), an MLIR-based compiler, export paths to target codegen compilers, and runtime infrastructure. We show that our approach to language and compiler design enables users to program with much less friction and ceremony while preserving access to all features and device APIs necessary for achieving performance competitive with existing AI Engine programming models.
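The dataflow style of streaming application the abstract describes can be illustrated with a plain-Python sketch. This is hypothetical and is not the paper's actual eDSL API; the names `producer`, `kernel`, and `consumer` are illustrative stand-ins for the stages of a producer-to-PE-to-consumer pipeline:

```python
# Hypothetical sketch of a streaming dataflow pipeline in plain Python.
# It only illustrates the producer -> compute kernel -> consumer
# structure of a streaming application; it does not use the paper's eDSL.

def producer(data, tile_size):
    """Stream the input as fixed-size tiles (like a DMA feeding a PE)."""
    for i in range(0, len(data), tile_size):
        yield data[i:i + tile_size]

def kernel(tiles):
    """Per-tile compute, standing in for a vectorized PE kernel."""
    for tile in tiles:
        yield [2 * x for x in tile]

def consumer(tiles):
    """Collect processed tiles back into a flat result."""
    out = []
    for tile in tiles:
        out.extend(tile)
    return out

result = consumer(kernel(producer(list(range(8)), tile_size=4)))
print(result)  # → [0, 2, 4, 6, 8, 10, 12, 14]
```

The point of the sketch is that each stage only sees one tile at a time, so data streams through the pipeline rather than being materialized whole, which mirrors how a dataflow program maps onto the streaming interconnect of a CGRA.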