VMR2L: Virtual Machines Rescheduling Using Reinforcement Learning in Data Centers

NeurIPS 2023 Workshop MLSys Submission 37 Authors

Published: 28 Oct 2023, Last Modified: 12 Dec 2023

Venue: MLSys Workshop NeurIPS 2023 OralPoster

Keywords: Virtual Machines Rescheduling, Reinforcement Learning

Abstract: Modern industry-scale data centers receive thousands of virtual machine (VM) requests per minute. Due to the continual creation and release of VMs, many small resource fragments are scattered across physical machines (PMs). To handle these fragments, data centers periodically reschedule some VMs to alternative PMs. Despite the increasing importance of VM rescheduling as data centers grow in size, the problem remains understudied. We first show that, unlike most combinatorial optimization tasks, the inference time of VM rescheduling algorithms significantly influences their performance, causing many existing methods to scale poorly. Therefore, we develop a reinforcement learning system for VM rescheduling, VMR2L, which incorporates a set of customized techniques, such as a two-stage framework that accommodates diverse constraints and workload conditions as well as an effective feature extraction module. Our experiments on an industry-scale data center show that VMR2L can achieve a performance comparable to the optimal solution, but with a running time of seconds.

Submission Number: 37