Efficiently mapping tasks to processors and data to memories is a cornerstone of achieving high performance in parallel programming. Traditionally, this critical task has been handled by expert-crafted mapper programs, tailored to specific machine architectures and problem domains. However, creating a custom mapper for each application is labor-intensive and time-consuming. Large language models (LLMs) have recently demonstrated remarkable capabilities in understanding and generating code, as well as in self-improvement toward specific performance metrics. Inspired by these advancements, we introduce the task of mapper generation (MAGE), which frames generating high-performance mappers as a discrete optimization problem aimed at maximizing compute throughput. To solve this optimization problem, we leverage reinforcement learning (RL) to guide LLMs through the mapper generation process. At the core of our approach lies a novel domain-specific language (DSL), which provides a high-level interface for LLMs to generate mapper code without getting entangled in complicated, low-level systems programming. Moreover, our DSL defines a structured, constrained search space for RL to explore, guiding LLMs to discover the optimal mapping policy. Our evaluation shows that LLM-generated mappers can surpass expert-written mappers in performance, achieving speedups of up to 34% across 9 benchmarks. Notably, our approach improves the throughput of parallel matrix multiplication algorithms by up to 31%, reducing development time from several days to just a few minutes.
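
To make the optimization loop concrete, the following is a minimal Python sketch of one way such RL-guided generation could be structured: an LLM proposes candidate mapper programs in the DSL, each candidate is compiled and benchmarked, and the measured throughput serves as the reward that steers subsequent proposals. The names `llm_propose` and `measure_throughput` are hypothetical stand-ins for the paper's components, not its actual interface.

```python
# Illustrative sketch of the reward-driven search described above.
# llm_propose and measure_throughput are hypothetical stand-ins;
# the paper's actual RL formulation and interfaces may differ.

def llm_propose(task_desc: str, feedback: str) -> str:
    """Hypothetical LLM call: returns a candidate mapper written in the DSL."""
    raise NotImplementedError("stand-in for an LLM completion API")

def measure_throughput(mapper_code: str) -> float:
    """Hypothetical harness: compiles the mapper and benchmarks the workload."""
    raise NotImplementedError("stand-in for a compile-and-benchmark harness")

def search_mapper(task_desc: str, iterations: int = 20) -> str:
    """Greedy search loop: the reward is measured compute throughput,
    and it is fed back to the LLM as textual feedback on each round."""
    best_code, best_reward = "", float("-inf")
    feedback = "no candidates evaluated yet"
    for _ in range(iterations):
        candidate = llm_propose(task_desc, feedback)
        reward = measure_throughput(candidate)
        feedback = f"last candidate achieved {reward:.2f} GFLOP/s"
        if reward > best_reward:
            best_code, best_reward = candidate, reward
    return best_code
```

The sketch only illustrates the reward loop over the DSL-constrained search space; the DSL itself is what keeps each `candidate` within a structured set of valid mapping policies rather than arbitrary low-level code.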