Abstract: Reinforcement Learning (RL) based task-scheduling approaches can be adopted in a Multi-access Edge Computing (MEC) based Internet of Things (IoT) network to address the inherent trade-off between energy consumption and deadline violation. However, the energy consumption of such solutions may be sensitive to varying task arrival rates. To this end, in this paper, we provide a Robust-Return Constrained Markov Decision Process (R2CMDP) formulation that minimizes the worst-case total discounted power consumption subject to a constraint on the total discounted deadline violation. Since the task arrival statistics may be unknown, we propose a novel robust RL-based task-allocation algorithm. In contrast to traditional RL algorithms, the proposed algorithm provides robust performance in the face of varying task arrival rates, and we prove that it converges to the optimal solution of the R2CMDP formulation. However, since the algorithm becomes computationally expensive as the state space grows, we also propose a robust learning task-scheduling heuristic with significantly lower computational complexity. Network Simulator-3 (ns-3) based simulation results demonstrate the efficacy of the proposed algorithms in comparison with state-of-the-art algorithms under realistic network scenarios.