Optimal Dynamic Proactive Caching Via Reinforcement Learning

Alireza Sadeghi, Fatemeh Sheikholeslami, Georgios B. Giannakis

2018 (modified: 16 Apr 2023) | SPAWC 2018 | Readers: Everyone

Abstract: Storing popular, reusable content at the edge of a heterogeneous wireless cellular network (HetNet) holds the promise of shifting load on low-rate, unreliable backhaul links from peak traffic hours to off-peak periods. To make intelligent use of the limited caching capacity, a content-agnostic small base station (SB) needs to proactively learn what and when to cache. An important challenge in realistic network scenarios is the spatio-temporal dynamics inherent to the unknown content popularity profiles. To cope with these dynamics, user demands are modeled by local and global Markov processes whose structure and transition probabilities are assumed unknown. A reinforcement learning framework is put forth, through which a cache control unit (CCU) at the SB can continuously learn, track, and possibly adapt to the underlying dynamics of user demands. A Q-learning algorithm is developed to solve the resulting reinforcement learning task, finding the optimal caching policy in an online fashion. Simulated tests demonstrate the effectiveness of the proposed proactive caching scheme under spatio-temporally dynamic demands.
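
To make the idea of learning a caching policy online more concrete, the following is a minimal tabular Q-learning sketch for a toy caching problem. It is not the authors' formulation: the demand model (a single Markov chain over requested files), the hit-or-miss reward, the function name `q_learning_cache`, and all parameter values are assumptions chosen purely for illustration. The learner never sees the transition probabilities; it only observes requests and rewards.

```python
import itertools
import random


def q_learning_cache(num_files=4, cache_size=2, steps=40000,
                     alpha=0.05, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning for a toy cache (illustrative assumptions only)."""
    rng = random.Random(seed)

    # Hypothetical demand model, hidden from the learner: the next request
    # is the cyclic successor of the current file w.p. 0.7, the same file
    # w.p. 0.2, and a uniformly random file otherwise.
    def next_request(current):
        u = rng.random()
        if u < 0.7:
            return (current + 1) % num_files
        if u < 0.9:
            return current
        return rng.randrange(num_files)

    # State: the file requested in the current slot.
    # Action: the set of files held in the cache for the next slot.
    actions = list(itertools.combinations(range(num_files), cache_size))
    q = {(s, a): 0.0 for s in range(num_files) for a in actions}

    state = 0
    for _ in range(steps):
        # Epsilon-greedy exploration over cache contents.
        if rng.random() < epsilon:
            action = rng.choice(actions)
        else:
            action = max(actions, key=lambda a: q[(state, a)])

        nxt = next_request(state)
        # Reward 1 for a cache hit, 0 for a miss (assumed reward signal).
        reward = 1.0 if nxt in action else 0.0

        # Standard Q-learning update toward the bootstrapped target.
        best_next = max(q[(nxt, a)] for a in actions)
        q[(state, action)] += alpha * (reward + gamma * best_next
                                       - q[(state, action)])
        state = nxt

    # Greedy policy: which files to cache after observing each request.
    policy = {s: max(actions, key=lambda a: q[(s, a)])
              for s in range(num_files)}
    return q, policy


if __name__ == "__main__":
    _, learned_policy = q_learning_cache()
    for requested, cached in learned_policy.items():
        print(f"after request {requested}: cache files {cached}")
```

In this toy setting the cache contents do not influence future demand, so the learned greedy policy should simply favor the files most likely to be requested next. A setting closer to the abstract would add separate local and global popularity states and a richer cost signal, which this sketch deliberately omits.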
