Deviation Backfilling: A Robust Backfilling Scheme for Improving the Efficiency of Job Scheduling on High Performance Computing Systems

Published: 01 Jan 2023, Last Modified: 11 Nov 2024, Venue: ACOMPA 2023, License: CC BY-SA 4.0
Abstract: The Big Data era has driven rapid progress in modern science. Making effective use of the very large volumes of collected data places urgent demands on High Performance Computing (HPC) systems, especially on the resource allocation process, which is commonly managed by the job scheduler. Although modern algorithms offer the potential for better scheduling outcomes, these systems continue to favor the FCFS and EASY backfilling approach because of its straightforward implementation and fair resource allocation. In this paper, we propose a novel and robust backfilling scheme called Deviation Backfilling, which improves the efficiency of job scheduling on HPC systems while maintaining user fairness. Combined with a kNN-based job runtime prediction method, Deviation Backfilling uses the deviation between the user's estimate and the system's prediction as the delay threshold for the first job in the backfilling queue. Our experimental results on real-world datasets show that Deviation Backfilling outperforms existing scheduling strategies on the evaluated performance metrics. This improvement also suggests that the scheme can motivate users to provide more accurate estimates, which in turn assists HPC scheduling.
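The mechanism described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not define the kNN features, the exact form of the deviation, or the admission rule, so everything below is an assumption made for illustration. In particular, the feature tuple `(nodes, user_estimate)`, the use of the absolute difference as the deviation, and the helper names `knn_predict`, `delay_threshold`, and `can_backfill` are hypothetical. The sketch also simplifies the backfilling test by assuming the candidate occupies nodes that the head job is waiting for.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes: int            # nodes requested
    user_estimate: float  # runtime requested by the user, in seconds


def knn_predict(history, features, k=3):
    """Predict a runtime as the mean runtime of the k nearest past jobs.

    history: list of (feature_tuple, actual_runtime) pairs.
    Distance is plain Euclidean distance (an assumption of this sketch).
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    nearest = sorted(history, key=lambda h: dist(h[0], features))[:k]
    return sum(runtime for _, runtime in nearest) / len(nearest)


def delay_threshold(user_estimate, predicted):
    """Deviation between user estimate and system prediction.

    Assumption: the absolute difference is used as the threshold.
    """
    return abs(user_estimate - predicted)


def can_backfill(candidate, now, free_nodes, head_start, threshold):
    """Decide whether a candidate job may start ahead of the head job.

    The candidate must fit in the free nodes, and the delay it would
    impose on the head job's reserved start must not exceed threshold.
    With threshold = 0 this reduces to a strict no-delay rule.
    """
    if candidate.nodes > free_nodes:
        return False
    finish = now + candidate.user_estimate
    delay = max(0.0, finish - head_start)
    return delay <= threshold


# Hypothetical history: ((nodes, user_estimate), actual_runtime)
history = [
    ((4, 3600), 1800),
    ((4, 3500), 2000),
    ((8, 7200), 7000),
    ((4, 3700), 2200),
]
head = Job("head", nodes=4, user_estimate=3600)
predicted = knn_predict(history, (head.nodes, head.user_estimate), k=3)
threshold = delay_threshold(head.user_estimate, predicted)

candidate = Job("small", nodes=2, user_estimate=1000)
allowed = can_backfill(candidate, now=0, free_nodes=4,
                       head_start=500, threshold=threshold)
strict = can_backfill(candidate, now=0, free_nodes=4,
                      head_start=500, threshold=0.0)
```

In this toy setup the candidate would overrun the head job's reserved start, so a strict no-delay rule rejects it, while a threshold derived from the deviation admits it. The intuition is that a head job whose user estimate is far from the predicted runtime can tolerate a correspondingly larger delay.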