Policy Learning For Video Streaming

Ofir Birka; Yedid Hoshen; Michael Schapira

22 Sept 2023 (modified: 11 Feb 2024). Submitted to ICLR 2024.

Supplementary Material: zip

Primary Area: reinforcement learning

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Keywords: Adaptive video bitrate (ABR), video streaming, policy learning, Quality of Experience (QoE)

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.

TL;DR: We improve video streaming quality through policy learning

Abstract: Facilitating good quality of experience (QoE) for Internet-based video services is a crucial real-world challenge. With remote/hybrid work, education, and telemedicine here to stay, poor video quality adversely impacts the economy and society at large. The key algorithmic challenge in this context is adaptive bitrate selection (ABR): continuously adjusting the video bitrate (resolution) to the prevailing traffic conditions. ABR algorithms struggle to maintain high resolutions while avoiding video stalls and long "lags behind live", and have received extensive attention. In particular, ABR has, in recent years, been approached from different ML perspectives. However, disillusionment with applications of end-to-end deep reinforcement learning (DRL) to ABR has effectively led to abandoning policy learning for ABR altogether in favor of control-theoretic optimization methods. We demonstrate that, through more nuanced policy learning, substantial improvement over the state of the art is achievable. Specifically, we show that applying deep Q-learning to the output of a supervised predictive model bests alternative approaches. As we believe that the ABR domain is an exciting new playground for policy learning, we release our code for ABR policy learning and experimentation to facilitate further research.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 5865
