Machine learning for load balancing in the Linux kernel

Published: 01 Jan 2020 · Last Modified: 13 Nov 2024 · APSys 2020 · CC BY-SA 4.0
Abstract: The OS load-balancing algorithm governs the performance gains provided by a multiprocessor computer system. Linux's Completely Fair Scheduler (CFS) tracks process loads by average CPU utilization to balance the workload between processor cores. That approach maximizes the utilization of processing time but overlooks contention for lower-level hardware resources. In servers running compute-intensive workloads, an imbalanced demand for these limited resources hinders execution performance. This paper addresses that problem with a machine learning (ML)-based, resource-aware load balancer. We describe (1) low-overhead methods for collecting training data; (2) a multi-layer perceptron (MLP) model that imitates the CFS load balancer based on the collected training data; and (3) an in-kernel implementation of inference on the model. Our experiments demonstrate that the proposed model achieves 99% accuracy in making migration decisions while increasing latency by only 1.9 μs.
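To make the in-kernel inference idea concrete, below is a minimal user-space sketch of the kind of computation involved: a small MLP forward pass evaluated with fixed-point integer arithmetic, since kernel code generally avoids floating point. The layer sizes, feature set, Q-format, and zero-threshold decision rule are illustrative assumptions, not the paper's actual model or implementation.

```c
/*
 * Sketch of a fixed-point MLP forward pass producing a binary
 * migration decision. Weights, sizes, and features are placeholders;
 * a real model would be trained offline and its weights exported.
 */
#include <stdio.h>
#include <stdint.h>

#define N_IN   4   /* e.g. src/dst CPU load, cache misses, runqueue length */
#define N_HID  8   /* single hidden layer, assumed size */
#define FRAC   10  /* fixed-point scale: value = raw / 2^FRAC */

/* Placeholder weights (zero-initialized here for brevity). */
static int32_t w1[N_HID][N_IN];
static int32_t b1[N_HID];
static int32_t w2[N_HID];
static int32_t b2;

/* Forward pass: returns nonzero if the task should be migrated. */
static int should_migrate(const int32_t x[N_IN])
{
    int32_t h[N_HID];
    int64_t acc;
    int i, j;

    for (i = 0; i < N_HID; i++) {
        acc = (int64_t)b1[i] << FRAC;          /* bias at product scale */
        for (j = 0; j < N_IN; j++)
            acc += (int64_t)w1[i][j] * x[j];   /* accumulate in 64 bits */
        acc >>= FRAC;                          /* rescale after multiply */
        h[i] = acc > 0 ? (int32_t)acc : 0;     /* ReLU activation */
    }

    acc = (int64_t)b2 << FRAC;
    for (i = 0; i < N_HID; i++)
        acc += (int64_t)w2[i] * h[i];
    acc >>= FRAC;

    return acc > 0;   /* threshold output: migrate or keep on current CPU */
}

int main(void)
{
    /* Illustrative feature vector in fixed-point form. */
    int32_t features[N_IN] = { 1 << FRAC, 2 << FRAC, 0, 0 };
    printf("migrate: %d\n", should_migrate(features));
    return 0;
}
```

Such a pass involves only a few hundred integer multiply-accumulates, which is consistent with the small per-decision latency overhead the abstract reports, though the paper's measured 1.9 μs figure applies to its own implementation, not this sketch.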