Last-mile Matters: Mitigating the Tail Latency of Virtualized Networks with Multipath Data Plane

Published: 01 Jan 2022, Last Modified: 12 May 2023 | CLUSTER 2022 | Readers: Everyone
Abstract: Virtualized networks have become the cornerstone of today's large-scale cloud data centers. In particular, the data plane of a virtualized network, consisting of virtual switches, virtual routers, and other software network functions, performs all network packet processing for virtual machines (VMs). However, current virtualized data plane solutions incur drastic performance interference with co-resident VMs and thus suffer from unpredictable network performance, especially in terms of tail latency. In this work, we show that the performance issue stems from the fact that the CPU plays a dual role in virtualized networks, serving both communication and computation. The many virtual network components and their complex packet processing place an undue burden on the hosts' CPUs and in turn cause mutual performance interference between VMs and networks. To address this issue, we present a multipath data plane solution in which VM traffic can be adaptively and seamlessly offloaded to adjacent hosts. The core of this design is optimizing the allocation of VM traffic among multiple paths. We formulate the VM multipath traffic allocation problem with coupled computing and network resource variables, which prior research treated as mutually independent. We then present a distributed algorithm that efficiently solves the large-scale, interdependent global optimization problem with convergence and optimality guarantees. Through extensive simulations and real-world testbed experiments, we show that our solution delivers consistent performance improvements (up to $6.7\times$ improvement in aggregate throughput and $21.4\times$ reduction in tail latency) in dynamic cloud systems.