Keywords: Resource allocation, GPU cluster, heterogeneity, deep learning, integer linear programming
TL;DR: We propose a method for adaptive management of machine learning jobs that uses two neural networks to cope with uncertainty in hardware utilization.
Abstract: In heterogeneous clusters whose resources vary in capability and energy efficiency, sustainable use of mixed-generation hardware is essential. We propose a method for adaptive management of machine learning jobs that aims to minimize energy consumption while meeting performance targets, and that uses two neural networks to cope with uncertainty in hardware utilization. We demonstrate the efficacy of this adaptive process using the Gavel benchmark [1].
Serve As Reviewer: ~Mahdi_Dolati1
Submission Number: 40