Abstract: In the current 5G core network, the rigid TCP cannot satisfy real-time service control requirements. The Quick UDP Internet Connections (QUIC) protocol is widely regarded as an emerging alternative to TCP in highly dynamic networks. Adopting multi-path protocols is an intuitive way to improve transmission rate and reliability; however, conventional schedulers struggle to adapt to complex and dynamic 5G/6G networks. In this article, we put forward a deep reinforcement learning (DRL) based path selection strategy, DeepPath, for the underlying multi-path QUIC (MPQUIC) protocol. DeepPath adopts a tailored Deep Q-Network (DQN) algorithm to learn an optimal decision-making policy. To effectively model the dynamics and heterogeneity of the paths, DeepPath's state representation comprises the average RTT, the standard deviation of the RTT, the size of the congestion window, and the average queue length. In the offline phase, the double network is trained jointly to accurately fit the path value; in the online phase, the multi-path scheduler makes greedy decisions in real time to maximize goodput. DeepPath is deployed on a real 5G core network and evaluated in extensive experimental environments. The results show that DeepPath outperforms existing multi-path schedulers in terms of delay.
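The abstract describes a per-path state of four features (average RTT, RTT standard deviation, congestion-window size, average queue length) and an online phase that greedily selects the highest-valued path. The sketch below illustrates that idea only; the feature names, the linear stand-in for the trained Q-network, and its weights are all hypothetical, not DeepPath's actual model.

```python
# Hypothetical per-path state, following the four features named in the abstract:
# (average RTT, RTT standard deviation, congestion-window size, average queue length).
def path_state(avg_rtt, rtt_std, cwnd, queue_len):
    return [avg_rtt, rtt_std, cwnd, queue_len]

# Stand-in for the trained Q-network: a fixed linear scorer with illustrative
# weights (lower RTT/queue and larger cwnd yield a higher path value).
WEIGHTS = [-1.0, -0.5, 0.2, -0.3]

def q_value(state):
    return sum(w * s for w, s in zip(WEIGHTS, state))

def greedy_select(paths):
    """Online phase: greedily pick the path with the highest estimated value."""
    return max(paths, key=lambda name: q_value(paths[name]))

paths = {
    "wifi":     path_state(avg_rtt=30.0, rtt_std=5.0, cwnd=40.0, queue_len=2.0),
    "cellular": path_state(avg_rtt=80.0, rtt_std=20.0, cwnd=60.0, queue_len=8.0),
}
print(greedy_select(paths))  # → wifi
```

In the full system, `q_value` would be a DQN trained offline (with a double network for stable value estimates); the greedy `argmax` step shown here is the lightweight part that runs per scheduling decision.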