Learning-Based Transport Control Adapted to Non-Stationarity for Real-Time Communication

Published: 01 Jan 2024, Last Modified: 13 Nov 2024, IWQoS 2024, CC BY-SA 4.0
Abstract: The rapid development of real-time communications (RTC) has created many challenges for designing a proper transport control module, which determines how much media data can be sent in real time. Reinforcement learning (RL)-based transport control algorithms have shown great potential, but they still face some unique challenges. For example, accurate bandwidth prediction is often necessary, yet bandwidth non-stationarity makes accuracy hard to guarantee. In addition, alleviating the cold-start and overestimation problems of learning-based algorithms to achieve higher training efficiency remains an open challenge. In this work, we propose a new training framework that leverages the Transformer model to capture the non-stationarity of the bandwidth sequence and improve bandwidth prediction accuracy, while using knowledge distillation and transfer learning to train the RL model efficiently and alleviate the model's cold-start problem in the training environment. We also employ the Double Q-learning mechanism to suppress overestimation and further improve training efficiency. Based on this framework, we train a new RTC transport control algorithm, NSAC, and test it on our own platform. Experiments show that NSAC adapts to unstable network environments better than state-of-the-art solutions. Under weak network conditions, video throughput increases by 14.11%, while delay, loss rate, and stall rate drop by 5.64%, 28.12%, and 25.86%, respectively. These improvements notably enhance the quality of user experience.
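The overestimation problem that the abstract's Double Q-learning mechanism targets can be illustrated with a small numerical sketch. This toy setup is not from the paper itself: several actions all have true expected reward 0, but taking the max over noisy value estimates is biased upward. The double estimator (van Hasselt's idea underlying Double Q-learning) selects the greedy action with one set of estimates and evaluates it with an independent set, removing that bias in expectation.

```python
import numpy as np

# Illustrative toy, not the NSAC implementation: every action's true
# expected reward is 0, so any positive estimate of max_a Q(a) is pure bias.
rng = np.random.default_rng(0)
n_actions, n_samples, n_trials = 8, 10, 5000

single, double = [], []
for _ in range(n_trials):
    # two independent batches of sample-mean value estimates per action
    a = rng.normal(0.0, 1.0, size=(n_actions, n_samples)).mean(axis=1)
    b = rng.normal(0.0, 1.0, size=(n_actions, n_samples)).mean(axis=1)
    single.append(a.max())          # standard max estimator: biased upward
    double.append(b[np.argmax(a)])  # select with a, evaluate with b: unbiased

print(f"single-estimator bias: {np.mean(single):+.3f}")  # clearly positive
print(f"double-estimator bias: {np.mean(double):+.3f}")  # close to zero
```

In Double Q-learning the same decoupling is applied online with two value tables (or networks) that take turns selecting and evaluating actions, which is why it suppresses the optimistic value drift of standard Q-learning.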