Abstract: Deep reinforcement learning (DRL) has demonstrated promising potential for adaptive video streaming. However, existing DRL-based methods for adaptive video streaming use only application (APP) layer information and rely on heuristic training methods. This paper aims to boost the quality of experience (QoE) of adaptive wireless video streaming by exploiting lower-layer information and deriving a rigorous training method. First, we formulate a more comprehensive and accurate adaptive wireless video streaming problem as an infinite-stage discounted Markov decision process (MDP) by additionally incorporating past and lower-layer information, which allows a flexible tradeoff between QoE and the computational and memory costs of solving the problem. Then, we propose an enhanced asynchronous advantage actor-critic (eA3C) method that jointly optimizes the parameters of the parameterized policy and value function. Specifically, we build an eA3C network consisting of a policy network and a value network that can exploit cross-layer, past, and current information, and we jointly train the eA3C network using pre-collected samples. Finally, experimental results show that the proposed eA3C method improves QoE by 6.8% $\sim$ 14.4% over state-of-the-art methods.
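To make the actor-critic structure described above concrete, the following is a minimal PyTorch sketch of a network with a policy head and a value head over a cross-layer state, trained with an A3C-style joint loss. The class names, feature layout (APP-layer buffer and throughput history plus assumed lower-layer indicators such as SNR/CQI), and hyperparameters are illustrative assumptions, not the paper's actual eA3C architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ActorCritic(nn.Module):
    """Hypothetical actor-critic over a cross-layer streaming state."""
    def __init__(self, state_dim: int, num_bitrates: int, hidden: int = 128):
        super().__init__()
        # Shared trunk over the concatenated state (buffer level, past
        # throughput, lower-layer indicators -- assumed feature layout).
        self.trunk = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.policy_head = nn.Linear(hidden, num_bitrates)  # actor: bitrate choice
        self.value_head = nn.Linear(hidden, 1)              # critic: state value

    def forward(self, state: torch.Tensor):
        h = self.trunk(state)
        log_probs = F.log_softmax(self.policy_head(h), dim=-1)
        values = self.value_head(h).squeeze(-1)
        return log_probs, values

def a3c_loss(log_probs, values, actions, returns,
             entropy_coef: float = 0.01, value_coef: float = 0.5):
    """Joint actor-critic loss on a batch of pre-collected transitions.

    `returns` are discounted QoE-based returns (assumed reward definition);
    the baseline V(s) is subtracted to form the advantage.
    """
    advantage = returns - values.detach()
    chosen = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
    policy_loss = -(chosen * advantage).mean()               # policy gradient term
    value_loss = F.mse_loss(values, returns)                 # critic regression term
    entropy = -(log_probs.exp() * log_probs).sum(dim=1).mean()  # exploration bonus
    return policy_loss + value_coef * value_loss - entropy_coef * entropy
```

In this sketch the policy and value parameters share a trunk and are updated together by a single backward pass through `a3c_loss`, mirroring the joint optimization of policy and value function that the abstract attributes to eA3C; the actual network inputs, horizon, and reward in the paper may differ.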