HeShare: Energy-Aware and Efficient Multi-Task GPU Sharing in Heterogeneous GPU-Based Computing Systems
Abstract: With the rapid growth of artificial intelligence and large-scale model computing, datacenter demand for GPUs continues to increase, especially for large-scale training and inference tasks. Heterogeneous multi-GPU systems, which integrate GPUs of varying types and computational capabilities, have become critical computing resources. This heterogeneity raises two main challenges. First, because GPUs differ in performance and power consumption, task scheduling becomes a complex multi-objective optimization problem that must balance energy efficiency and performance. Second, and more importantly, the lack of coordinated mechanisms for multi-task sharing and energy-efficient resource management across heterogeneous GPUs can result in GPU overload or underutilization, leading to wasted resources and potential system risks. To address these challenges, we propose HeShare, an energy-aware and efficient heterogeneous GPU framework for datacenters. First, we design an energy-aware task scheduling strategy that optimizes task allocation across different GPUs to balance energy consumption and performance. Second, we introduce a GPU sharing optimization mechanism that adaptively configures Multi-Process Service (MPS) and dynamic voltage and frequency scaling (DVFS) settings for each GPU, improving resource utilization and reducing overall energy consumption while preserving task performance. Compared with the state-of-the-art framework, HeShare reduces average energy costs by 26% and improves job completion time by 31%, striking a balance between energy efficiency and performance.
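The energy/performance trade-off described in the abstract can be illustrated with a minimal sketch. This is not HeShare's scheduling algorithm, which the abstract does not specify. It assumes a simple weighted-sum scalarization of the two objectives. The `Gpu` fields, the `pick_gpu` function, the `alpha` weight, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gpu:
    name: str
    speed: float    # hypothetical relative throughput (work units per second)
    power_w: float  # hypothetical average power draw under load (watts)


def pick_gpu(work, gpus, busy_until, alpha=0.5):
    """Pick the GPU with the lowest weighted cost for a task of size `work`.

    alpha=1.0 considers only finish time; alpha=0.0 considers only energy.
    `busy_until` maps a GPU name to the time at which it becomes free.
    Assumes work > 0 so the normalizers below are non-zero.
    """
    candidates = []
    for gpu in gpus:
        runtime = work / gpu.speed
        finish = busy_until.get(gpu.name, 0.0) + runtime
        energy = runtime * gpu.power_w  # joules = seconds * watts
        candidates.append((gpu, finish, energy))

    # Normalize each objective by its worst value so the two are comparable.
    max_finish = max(c[1] for c in candidates)
    max_energy = max(c[2] for c in candidates)

    def cost(candidate):
        _, finish, energy = candidate
        return alpha * finish / max_finish + (1 - alpha) * energy / max_energy

    return min(candidates, key=cost)[0]


gpus = [Gpu("fast", speed=2.0, power_w=300.0),
        Gpu("efficient", speed=1.0, power_w=100.0)]
time_first = pick_gpu(10.0, gpus, busy_until={}, alpha=1.0)
energy_first = pick_gpu(10.0, gpus, busy_until={}, alpha=0.0)
```

With these made-up numbers, the faster GPU finishes the task in half the time but uses 1.5 times the energy, so the choice flips as `alpha` moves from 1 to 0. A real scheduler would also have to account for interference between tasks sharing a GPU and for per-GPU MPS and DVFS settings, which this sketch ignores.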
External IDs:dblp:journals/tc/JiangCZZWQMGB26