Cooperative Action Branching Deep Reinforcement Learning for Uplink Power Control

Published: 01 Jan 2023, Last Modified: 14 Aug 2025 · EuCNC/6G Summit 2023 · CC BY-SA 4.0
Abstract: Wireless networks are expected to evolve into self-sustaining networks in 6G, where machine learning plays a critical role in maintaining high performance in dynamically changing environments. In such environments, machine learning algorithms are expected to optimize multiple correlated network parameters simultaneously. One example of such correlated parameters, which already exist in 5G, are the outer-loop power control (OLPC) parameters. In this paper we show how a single double deep Q-network (DDQN) can optimize both OLPC parameters, <tex>$P_{0}$</tex> and <tex>$\alpha_{pl}$</tex>, simultaneously through two output branch dimensions. To tackle the multi-agent problems caused by interference, we describe how multiple machine learning agents must cooperate to optimize the performance of the whole network. Extensive simulation studies confirm that the proposed reinforcement learning methods can outperform the traditional way of setting OLPC parameters, where a globally sub-optimal parametrization is hand-picked. Because the cooperating agents learned to discriminate against some cells in order to maximize overall throughput, we additionally experiment with methods that equalize performance between the learning agents. Thus, the introduced methods can be envisioned to also learn other combinations of multiple simultaneous optimization actions in future 6G networks.
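The abstract's core idea, a single Q-network with one output branch per OLPC parameter, can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the layer sizes, the candidate <tex>$P_{0}$</tex> (dBm) and <tex>$\alpha_{pl}$</tex> levels, and the use of a random untrained NumPy network in place of the paper's trained DDQN. It shows only the branching structure: a shared state encoding feeds two independent Q-value heads, and the greedy joint action is the per-branch argmax.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM = 8   # size of the per-agent state vector (assumed)
HIDDEN = 16     # shared-trunk width (assumed)

# Illustrative discrete action levels for each branch (not from the paper).
P0_LEVELS = [-106, -100, -94, -88]         # candidate P_0 values in dBm
ALPHA_LEVELS = [0.6, 0.7, 0.8, 0.9, 1.0]   # candidate alpha_pl values

# Random weights stand in for a trained network: one shared trunk,
# plus one output head per action branch.
W_trunk = rng.normal(size=(STATE_DIM, HIDDEN))
W_p0 = rng.normal(size=(HIDDEN, len(P0_LEVELS)))
W_alpha = rng.normal(size=(HIDDEN, len(ALPHA_LEVELS)))

def branched_q_values(state):
    """Return one Q-value vector per branch from a shared representation."""
    h = np.tanh(state @ W_trunk)   # shared feature extraction
    return h @ W_p0, h @ W_alpha   # independent heads: P_0 branch, alpha branch

def select_action(state):
    """Greedy joint action: independent argmax in each branch."""
    q_p0, q_alpha = branched_q_values(state)
    return P0_LEVELS[int(np.argmax(q_p0))], ALPHA_LEVELS[int(np.argmax(q_alpha))]

state = rng.normal(size=STATE_DIM)
p0, alpha = select_action(state)
print(f"selected P_0 = {p0} dBm, alpha_pl = {alpha}")
```

The branching keeps the output size additive (|P0| + |alpha| heads) rather than multiplicative (|P0| x |alpha| joint actions), which is the usual motivation for action-branching architectures.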