Abstract: In recent years, the efficiency and even the feasibility of traditional load-balancing policies are challenged by the rapid growth of cloud infrastructure with increasing levels of server heterogeneity and increasing size of cloud services and applications. In such many software-load-balancers heterogeneous systems, traditional solutions, such as JSQ, incur an increasing communication overhead, whereas low-communication alternatives, such as JSQ(d) and the recently proposed JIQ scheme are either unstable or provide poor performance. We argue that a better low-communication load balancing scheme can be established by allowing each dispatcher to have a different view of the system and keep using JSQ, rather than greedily trying to avoid starvation on a per-decision basis. accordingly, we introduce the Loosely-Shortest -Queue family of load balancing algorithms. Roughly speaking, in Loosely-shortest -Queue, each dispatcher keeps a different approximation of the server queue lengths and routes jobs to the shortest among them. Communication is used only to update the approximations and make sure that they are not too far from the real queue lengths in expectation. We formally establish the strong stability of any Loosely-Shortest -Queue policy and provide an easy-to-verify sufficient condition for verifying that a policy is Loosely-Shortest -Queue. We further demonstrate that the Loosely-Shortest -Queue approach allows constructing throughput optimal policies with an arbitrarily low communication budget. Finally, using extensive simulations that consider homogeneous, heterogeneous and highly skewed heterogeneous systems in scenarios with a single dispatcher as well as with multiple dispatchers, we show that the examined Loosely-Shortest -Queue example policies are always stable as dictated by theory. Moreover, it exhibits an appealing performance and significantly outperforms well-known low-communication policies, such as JSQ(d) and JIQ, while using a similar communication budget.
Keywords: load balancing, multiple dispatchers, communication overhead, heterogeneous systems.
TL;DR: Scalable and low communication load balancing solution for heterogeneous-server multi-dispatcher systems with strong theoretical guarantees and promising empirical results.