Continuous Bitrate & Latency Control with Deep Reinforcement Learning for Live Video Streaming

Ruying Hong, Qiwei Shen, Lei Zhang, Jing Wang

Published: 2019, Last Modified: 13 Nov 2024, ACM Multimedia 2019, CC BY-SA 4.0

Abstract: In this paper, we introduce a continuous bitrate and latency control model for the Live Video Streaming Challenge. Our model is based on Deep Deterministic Policy Gradient (DDPG), an algorithm widely used for continuous control tasks. Because its actions are continuous, the model can exercise fine-grained control and does not need to discretize the continuous "latency limit", a buffer threshold that minimizes end-to-end delay through frame skipping. Across all considered live video scenarios, our model provides a better quality of experience (QoE), improving average QoE by 3.6% over a DQN that discretizes the "latency limit". Additionally, the challenge results show the effectiveness and applicability of the proposed model, which achieved top performance in three different networks (high, low, and oscillating throughput) and ranked second in the network with medium throughput.
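To illustrate the difference between a continuous and a discretized "latency limit", the following minimal sketch maps a two-dimensional continuous action, as a DDPG-style actor might emit, to a bitrate choice and a latency limit, and contrasts it with snapping the limit to a fixed grid. The bitrate ladder, the latency-limit range, the action layout, and the names `map_action` and `snap_to_grid` are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
from typing import List, Tuple

# Hypothetical values for illustration only; not taken from the paper.
BITRATES_KBPS: List[float] = [500.0, 850.0, 1200.0, 1850.0]
LATENCY_LIMIT_RANGE_S: Tuple[float, float] = (1.0, 4.0)


def _clip(x: float, lo: float, hi: float) -> float:
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def map_action(action: Tuple[float, float]) -> Tuple[int, float]:
    """Map an actor output in [-1, 1]^2 to (bitrate index, latency limit in s).

    Assumed layout: action[0] selects the bitrate, action[1] the latency limit.
    """
    a_bitrate = _clip(action[0], -1.0, 1.0)
    a_latency = _clip(action[1], -1.0, 1.0)

    # Rescale to [0, n-1] and round to the nearest rung of the bitrate ladder.
    idx = int(round((a_bitrate + 1.0) / 2.0 * (len(BITRATES_KBPS) - 1)))

    # Rescale linearly to the latency-limit range; no discretization needed.
    lo, hi = LATENCY_LIMIT_RANGE_S
    latency_limit = lo + (a_latency + 1.0) / 2.0 * (hi - lo)
    return idx, latency_limit


def snap_to_grid(value: float, lo: float, hi: float, bins: int) -> float:
    """Snap value to the nearest of `bins` evenly spaced points in [lo, hi].

    This mimics what a discrete-action agent (such as a DQN) could express.
    """
    step = (hi - lo) / (bins - 1)
    k = round((_clip(value, lo, hi) - lo) / step)
    return lo + k * step
```

With a four-point grid over the assumed range, a desired limit of 2.3 s would be snapped to the nearest grid point, whereas the continuous mapping can represent it directly; this is the kind of fine-grained control the abstract refers to.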