Learning Sub-Second Routing Optimization in Computer Networks requires Packet-Level Dynamics

Published: 15 Oct 2024, Last Modified: 15 Oct 2024. Accepted by TMLR. License: CC BY 4.0
Abstract: Finding efficient routes for data packets is an essential task in computer networking. The optimal routes depend greatly on the current network topology, state, and traffic demand, and they can change within milliseconds. Reinforcement Learning can help to learn network representations that provide routing decisions for possibly novel situations. So far, this has commonly been done using fluid network models. We investigate their suitability for millisecond-scale adaptations with a range of traffic mixes and find that packet-level network models are necessary to capture true dynamics, in particular in the presence of TCP traffic. To this end, we present PackeRL, the first packet-level Reinforcement Learning environment for routing in generic network topologies. Our experiments confirm that learning-based strategies that have been trained in fluid environments do not generalize well to this more realistic but more challenging setup. Hence, we also introduce two new algorithms for learning sub-second Routing Optimization. We present M-Slim, a dynamic shortest-path algorithm that excels at high traffic volumes but is computationally hard to scale to large network topologies, and FieldLines, a novel next-hop policy design that re-optimizes routing for any network topology within milliseconds without requiring any re-training. Both algorithms outperform current learning-based approaches as well as commonly used static baseline protocols, particularly in high-traffic-volume scenarios. All findings are backed by extensive experiments under realistic network conditions in our fast and versatile training and evaluation framework.
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Xi_Lin2
Submission Number: 3121