Optimize Next State Prediction in Safe RL for 5G Ecosystem

Published: 01 Jan 2022, Last Modified: 09 May 2023, COMSNETS 2022
Abstract: Traditional supervised learning techniques require a fairly large amount of data and computational power to train models, both of which are scarce in the mobile ecosystem. Moreover, in many situations large-scale data collection is simply infeasible, and one may resort to Reinforcement Learning (RL), which enables an agent to learn from scratch by interacting with the environment. Since the agent learns from scratch by taking random actions, the environment may transition to an unsafe state. The literature offers methods that predict a transition to an unsafe state in advance and ensure that the offending action is not executed. However, the underlying process is stochastic, so standard prediction methods yield poor accuracy. In this paper, we propose a method to predict the possible next transition state under the assumption that the labels are noisy. Results show that, by accounting for the underlying stochasticity, the proposed method predicts the next state with higher accuracy than existing methods. A further benefit is faster agent convergence compared with traditional RL methods, since the expected trajectories are modified in advance, saving substantial computational resources. Three case studies related to 5G mobile networks are considered to explore the advantages of the proposed method.
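
To make the core idea concrete, here is a minimal, hypothetical Python sketch of the pattern the abstract describes: a next-transition safety predictor trained on noisy labels (handled here with simple label smoothing) that vetoes an agent's action before execution. The class name, the toy environment, and the choice of label smoothing are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class NoisyLabelSafetyPredictor:
    """Logistic model of P(unsafe next state | state, action), trained on
    noisy safety labels. Hypothetical sketch, not the paper's method."""

    def __init__(self, n_features, smoothing=0.1, lr=0.05):
        self.w = np.zeros(n_features)
        self.b = 0.0
        self.smoothing = smoothing  # soften targets to absorb label noise
        self.lr = lr

    def fit(self, X, y, epochs=200):
        # Label smoothing: 1 -> 1 - s, 0 -> s, so the model is not forced
        # to be fully confident on possibly mislabeled transitions.
        y_soft = y * (1 - self.smoothing) + (1 - y) * self.smoothing
        for _ in range(epochs):
            p = sigmoid(X @ self.w + self.b)
            grad = X.T @ (p - y_soft) / len(y)
            self.w -= self.lr * grad
            self.b -= self.lr * np.mean(p - y_soft)

    def is_unsafe(self, x, threshold=0.5):
        return sigmoid(x @ self.w + self.b) > threshold

# Toy training data: features encode (state, action); the binary label marks
# an unsafe transition, with 15% of labels randomly flipped to mimic noise.
X = rng.normal(size=(500, 4))
true_y = (X[:, 0] + X[:, 1] > 0).astype(float)
flip = rng.random(500) < 0.15
y = np.where(flip, 1 - true_y, true_y)

predictor = NoisyLabelSafetyPredictor(n_features=4)
predictor.fit(X, y)

# Inside the RL loop, the agent's proposed action is screened first:
candidate = rng.normal(size=4)
if predictor.is_unsafe(candidate):
    print("action vetoed: predicted unsafe next state")
else:
    print("action allowed")
```

Screening actions through such a predictor is what lets the agent skip trajectories that would end in unsafe states, which is the source of the faster convergence claimed in the abstract.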