HeShare: Energy-Aware and Efficient Multi-Task GPU Sharing in Heterogeneous GPU-Based Computing Systems

Zhuolong Jiang, Zinuo Cai, Hongyu Zhao, Baoheng Zhang, Tianqi Wu, Yiming Qiang, Ruhui Ma, Haibing Guan, Rajkumar Buyya

Published: 2026, Last Modified: 18 Mar 2026
Venue: IEEE Trans. Computers 2026
License: CC BY-SA 4.0
Abstract: With the rapid growth of artificial intelligence and large-scale model computing, the demand for GPUs in datacenters continues to increase, especially for large-scale training and inference tasks. Heterogeneous multi-GPU systems, which integrate GPUs of varying types and computational capabilities, have become critical computing resources. This heterogeneity raises two main challenges. First, because GPUs differ in performance and power consumption, task scheduling becomes a complex multi-objective optimization problem that must balance energy efficiency against performance. Second, and more importantly, the lack of coordinated mechanisms for multi-task sharing and energy-efficient resource management across heterogeneous GPUs can result in GPU overload or underutilization, leading to wasted resources and potential system risks. To address these challenges, we propose HeShare, an energy-aware and efficient heterogeneous GPU framework for datacenters. First, we design an energy-aware task scheduling strategy that optimizes task allocation across different GPUs to balance energy consumption and performance. Second, we introduce a GPU sharing optimization mechanism that adaptively configures the MPS (Multi-Process Service) and DVFS (dynamic voltage and frequency scaling) settings of each GPU, enhancing resource utilization and reducing overall energy consumption while preserving task performance. Compared to the state-of-the-art framework, HeShare reduces average energy costs by 26% and improves job completion time by 31%, achieving a balance between energy efficiency and performance.
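To make the energy/performance trade-off in the abstract concrete, the following is a minimal illustrative sketch of a weighted-cost greedy placement across heterogeneous GPUs. It is not HeShare's scheduling algorithm, which the abstract does not specify. The `Gpu` fields, the `alpha` weight, the `assign` function, and all numeric values are hypothetical, and the model assumes each task runs alone on its GPU (no MPS sharing or DVFS tuning).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gpu:
    name: str
    speed: float    # hypothetical relative throughput (work units per second)
    power_w: float  # hypothetical average power draw under load (watts)


def assign(tasks, gpus, alpha=0.5):
    """Greedily place each task on the GPU with the lowest weighted cost.

    tasks: dict mapping task name -> amount of work (positive work units)
    alpha: weight on energy; 1.0 considers energy only, 0.0 finish time only
    Returns (placement, busy_until), both keyed by name.
    """
    busy_until = {g.name: 0.0 for g in gpus}
    placement = {}
    # Consider the largest tasks first so they get the widest choice of GPUs.
    for task, work in sorted(tasks.items(), key=lambda kv: -kv[1]):
        runtime = {g.name: work / g.speed for g in gpus}
        finish = {g.name: busy_until[g.name] + runtime[g.name] for g in gpus}
        energy = {g.name: g.power_w * runtime[g.name] for g in gpus}
        # Normalize both objectives to [0, 1] so alpha is a unitless weight.
        max_finish = max(finish.values()) or 1.0
        max_energy = max(energy.values()) or 1.0

        def cost(g):
            return (alpha * energy[g.name] / max_energy
                    + (1.0 - alpha) * finish[g.name] / max_finish)

        best = min(gpus, key=cost)
        placement[task] = best.name
        busy_until[best.name] = finish[best.name]
    return placement, busy_until


# Hypothetical two-GPU system: one fast but power-hungry, one slow but frugal.
example_gpus = [Gpu("fast", speed=2.0, power_w=300.0),
                Gpu("efficient", speed=1.0, power_w=100.0)]
example_tasks = {"a": 10.0, "b": 4.0}
time_first, _ = assign(example_tasks, example_gpus, alpha=0.0)
energy_first, _ = assign(example_tasks, example_gpus, alpha=1.0)
```

In this toy setup, `alpha=0.0` spreads the two tasks across both GPUs to finish sooner, while `alpha=1.0` places both on the lower-power GPU at the cost of a longer completion time. A real system would also have to account for co-located tasks and per-GPU frequency settings, which this sketch omits.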