StaR: Breaking the Scalability Limit for RDMA

Published: 01 Jan 2021, Last Modified: 05 May 2025ICNP 2021EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: Due to its superior performance, Remote Direct Memory Access (RDMA) has been widely deployed in data center networks. It provides applications with ultra-high throughput, ultra-low latency, and far lower CPU utilization than TCP/IP software network stack. However, the connection states that must be stored on the RDMA NIC (RNIC) and the small NIC memory result in poor scalability. The performance drops significantly when the RNIC needs to maintain a large number of concurrent connections.We propose StaR (Stateless RDMA), which solves the scalability problem of RDMA by transferring states to the other communication end. Leveraging the asymmetric communication pattern in data center applications, StaR lets the communication end with low concurrency save states for the other end with high concurrency, thus making the RNIC on the bottleneck side to be stateless. We have implemented StaR on an FPGA board with 10Gbps network port and evaluated its performance on a testbed with 9 machines all equipped with StaR NICs. The experimental results show that in high concurrency scenarios, the throughput of StaR can reach up to 4.13x and 1.35x of the original RNIC and the latest software-based solution, respectively.
Loading