A Smart Cache Content Update Policy Based on Deep Reinforcement Learning

Published in Wirel. Commun. Mob. Comput., 2020. Last modified: 16 May 2023.
Abstract: This paper proposes a deep reinforcement learning (DRL)-based cache content update policy for cache-enabled networks to improve the cache hit ratio and reduce the average latency. In contrast to existing policies, a more practical caching scenario is considered, in which content requests vary in both time and location. Under the constraint of limited cache capacity, the dynamic content update problem is modeled as a Markov decision process (MDP), and a deep Q-learning network (DQN) is used to solve it. Specifically, a neural network is trained to approximate the Q value, with training data sampled from an experience replay memory, and the DQN agent derives the optimal cache update policy. Simulation results show that, compared with existing policies, the proposed policy improves the cache hit ratio by 56%–64% and reduces the average latency by 56%–59%.
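
The abstract outlines the core DQN loop: a neural network approximates the Q value, training minibatches are sampled from an experience replay memory, and the agent selects cache update actions. The sketch below illustrates that loop in PyTorch under assumed details the abstract does not specify: the state encoding (`STATE_DIM`), action space (`NUM_ACTIONS`), library and cache sizes (`NUM_CONTENTS`, `CACHE_SIZE`), network architecture, and hyperparameters are all hypothetical placeholders, not the paper's actual design.

```python
# Minimal DQN sketch for a cache content update agent.
# All sizes, the reward, and the network are illustrative assumptions.
import random
from collections import deque

import torch
import torch.nn as nn

NUM_CONTENTS = 50             # assumed size of the content library
CACHE_SIZE = 10               # assumed cache capacity (the MDP constraint)
STATE_DIM = NUM_CONTENTS * 2  # e.g. per-content request counts + cache indicator
NUM_ACTIONS = CACHE_SIZE + 1  # replace one cached item, or keep the cache as-is

class QNetwork(nn.Module):
    """Approximates the Q value of each cache update action for a state."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 128), nn.ReLU(),
            nn.Linear(128, NUM_ACTIONS),
        )

    def forward(self, state):
        return self.net(state)

replay = deque(maxlen=10_000)  # experience replay memory
policy_net = QNetwork()
target_net = QNetwork()
target_net.load_state_dict(policy_net.state_dict())
optimizer = torch.optim.Adam(policy_net.parameters(), lr=1e-3)
GAMMA, BATCH = 0.99, 64

def select_action(state, epsilon):
    """Epsilon-greedy choice over cache update actions."""
    if random.random() < epsilon:
        return random.randrange(NUM_ACTIONS)
    with torch.no_grad():
        return policy_net(state).argmax().item()

def train_step():
    """Sample a minibatch from replay memory and fit the Q network."""
    if len(replay) < BATCH:
        return
    # Transitions are stored as tensors, e.g.:
    # replay.append((state, torch.tensor(action),
    #                torch.tensor(reward, dtype=torch.float32), next_state))
    batch = random.sample(replay, BATCH)
    states, actions, rewards, next_states = map(torch.stack, zip(*batch))
    q = policy_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        q_next = target_net(next_states).max(1).values  # bootstrapped target
    loss = nn.functional.mse_loss(q, rewards + GAMMA * q_next)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In a full training loop, the environment would emit a reward tied to cache hits and latency, and `target_net` would be periodically synced from `policy_net`; both details are assumptions here, as the abstract does not state the paper's exact reward design.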