Advancing RDMA Scalability With High Performance

Xijin Yin, Guo Chen, Xizheng Wang, Bin Wang, Huichen Dai, Bojie Li, Binzhang Fu, Kun Tan

Published: 2026, Last Modified: 02 Apr 2026, IEEE Trans. Netw. 2026, CC BY-SA 4.0
Abstract: Due to its superior performance, Remote Direct Memory Access (RDMA) has been widely deployed in data center networks. It provides applications with ultra-high throughput, ultra-low latency, and far lower CPU utilization than the TCP/IP software network stack. However, RDMA scales poorly because connection states must be stored on the RDMA NIC (RNIC), whose memory is small. Performance drops significantly when the RNIC needs to maintain a large number of concurrent connections. We propose StaR (Stateless RDMA), which solves the scalability problem of RDMA by transferring states to the other communication end in a trusted network. Leveraging the asymmetric communication pattern in data center applications, StaR lets the communication end with low NIC memory usage save states for the other end with high NIC memory usage, thus making the RNIC on the bottleneck side stateless. We implemented StaR on an FPGA board with a 10 Gbps network port and in NS-3. We evaluated its performance on a testbed of 9 machines equipped with StaR NICs, and verified that its scalability remains stable in a larger-scale simulation of 200 fully connected nodes with 100 Gbps links. The experimental results show that in high-concurrency scenarios, StaR's throughput reaches up to 4.13× that of the original RNIC and 1.35× that of the latest software-based solution.
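The state-transfer idea in the abstract can be illustrated with a toy model. The following is only a sketch under stated assumptions, not the StaR design: the abstract does not describe the state layout or the wire protocol, so `ConnState`, `StatelessServerNIC`, `ClientNIC`, and the packet-sequence-number check are all hypothetical names and mechanisms chosen for illustration. The point it shows is that the bottleneck side keeps no per-connection table, because each lightly loaded peer stores that state and hands it back with every request.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ConnState:
    """Hypothetical per-connection state; the real StaR layout is not given in the abstract."""
    conn_id: int
    next_psn: int = 0        # next expected packet sequence number
    bytes_received: int = 0  # running byte count for this connection


class StatelessServerNIC:
    """Bottleneck side (assumed name): holds no per-connection table."""

    def handle(self, state: ConnState, psn: int, payload: bytes):
        # The state arrives with the request and is trusted as-is,
        # mirroring the abstract's trusted-network assumption.
        if psn != state.next_psn:
            return state, False  # out of order: state unchanged, reject
        new_state = replace(
            state,
            next_psn=psn + 1,
            bytes_received=state.bytes_received + len(payload),
        )
        return new_state, True


class ClientNIC:
    """Lightly loaded side (assumed name): stores the server's state on its behalf."""

    def __init__(self, conn_id: int):
        self.server_state = ConnState(conn_id)
        self.psn = 0

    def send(self, server: StatelessServerNIC, payload: bytes) -> bool:
        new_state, ok = server.handle(self.server_state, self.psn, payload)
        self.server_state = new_state  # keep the updated state for next time
        if ok:
            self.psn += 1
        return ok


server = StatelessServerNIC()
clients = [ClientNIC(i) for i in range(1000)]
for c in clients:
    c.send(server, b"hello")
    c.send(server, b"rdma")
```

In this sketch the server object's memory footprint does not grow with the number of clients, since each connection's state lives in the corresponding `ClientNIC`. A real design would also have to handle loss, reordering, and state integrity, none of which is modeled here.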