Abstract: When building modern scientific and engineering applications, experienced developers often rely on well-tuned libraries and assign library routines to computing tasks to improve performance. However, such libraries are meticulously customized for specific target architectures or environments. In addition, the performance of their routines depends heavily on the actual input data of each computing task, which often remains unknown until runtime. Consequently, statically allocating library routines may limit the adaptability of applications and compromise performance, particularly on heterogeneous systems. To address this issue, we propose the Compiler-Assisted Adaptive Library Routines Allocation (COALA) framework for heterogeneous systems. COALA is a fully automated mechanism that uses compiler assistance to dynamically allocate the most suitable routine to each computing task. It supports the deployment of different allocation policies tailored to specific optimization targets. During application compilation, COALA reconstructs computing tasks and inserts a probe for each of them. At runtime, each probe conveys key information about its task's requirements, including its computing objective, data size, and number of floating-point operations (FLOPs), to a user-level allocation component. The allocation component then uses the probe information together with the allocation policy to assign the most suitable library routine to each computing task. In our prototype, we further introduce and deploy a performance-oriented allocation policy based on a machine learning-based performance evaluation method for library routines.
Experimental verification and evaluation on two heterogeneous systems reveal that COALA can significantly improve application performance, with gains of up to 4.3x for numerical simulation software and 4.2x for machine learning applications, and enhance system utilization by up to 27.8%.
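The probe-and-allocate flow described above can be illustrated with a minimal sketch. This is not COALA's implementation: the `Probe` and `Routine` types, the `allocate` function, the routine names, and all hardware figures are hypothetical, and a simple roofline-style cost estimate stands in for the machine learning-based performance evaluation used in the prototype.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Probe:
    """Hypothetical runtime record emitted for one computing task."""
    objective: str   # computing objective, e.g. "gemm"
    data_bytes: int  # data size of the task's operands
    flops: int       # number of floating-point operations


@dataclass(frozen=True)
class Routine:
    """Hypothetical description of one candidate library routine."""
    name: str
    objective: str
    flops_per_s: float   # assumed sustained compute rate
    bytes_per_s: float   # assumed data movement rate
    overhead_s: float    # assumed fixed launch cost


def predicted_time(probe: Probe, routine: Routine) -> float:
    # Roofline-style stand-in for a learned performance model:
    # the task is bound by either compute or data movement.
    compute = probe.flops / routine.flops_per_s
    transfer = probe.data_bytes / routine.bytes_per_s
    return routine.overhead_s + max(compute, transfer)


def allocate(
    probe: Probe,
    routines: Sequence[Routine],
    cost: Callable[[Probe, Routine], float] = predicted_time,
) -> Routine:
    """Performance-oriented policy: pick the routine with the lowest predicted cost."""
    candidates = [r for r in routines if r.objective == probe.objective]
    if not candidates:
        raise LookupError(f"no routine implements {probe.objective!r}")
    return min(candidates, key=lambda r: cost(probe, r))


# Illustrative figures only; not measurements from any real system.
ROUTINES = [
    Routine("cpu_gemm", "gemm", flops_per_s=1e11, bytes_per_s=5e10, overhead_s=1e-6),
    Routine("gpu_gemm", "gemm", flops_per_s=1e13, bytes_per_s=1e10, overhead_s=1e-4),
]


def gemm_probe(n: int) -> Probe:
    # Square double-precision matrix multiply: 2*n^3 flops, three n*n operands.
    return Probe("gemm", data_bytes=3 * n * n * 8, flops=2 * n ** 3)


small_choice = allocate(gemm_probe(100), ROUTINES).name
large_choice = allocate(gemm_probe(4000), ROUTINES).name
```

Under these assumed figures, the small task stays on the CPU routine because the fixed launch overhead dominates, while the large task moves to the GPU routine. Passing a different `cost` function corresponds to deploying a different allocation policy.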