Trajectory Design and Access Control for Air-Ground Coordinated Communications System With Multiagent Deep Reinforcement Learning

Abstract: Unmanned-aerial-vehicle (UAV)-assisted communication has attracted increasing attention recently. This article investigates an air–ground coordinated communications system in which the trajectories of aerial UAV base stations (UAV-BSs) and the access control of ground users (GUs) are jointly optimized. We formulate this optimization problem as a mixed cooperative–competitive game, in which each GU competes for the limited resources of the UAV-BSs, accessing a suitable UAV-BS to maximize its own throughput, while the UAV-BSs cooperate with one another, designing their trajectories to maximize a defined "fair throughput" that improves the total throughput while preserving fairness among GUs. Moreover, the action space of the GUs is discrete, whereas that of the UAV-BSs is continuous. To handle this hybrid action space, we transform the discrete actions into continuous action probabilities and propose a multiagent deep reinforcement learning (MADRL) approach, named air–ground probabilistic multiagent deep deterministic policy gradient (AG-PMADDPG). With well-designed rewards, AG-PMADDPG coordinates the two types of agents, UAV-BSs and GUs, so that each achieves its own objective based on local observations. Simulation results demonstrate that AG-PMADDPG outperforms the benchmark algorithms in terms of throughput and fairness.
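The central implementation idea in the abstract is the relaxation of each GU's discrete access decision into a continuous probability vector, so that a deterministic-policy-gradient actor can serve both agent types. The sketch below illustrates one plausible realization in PyTorch; it is a minimal sketch of the hybrid-action idea, not the authors' code, and the network architectures, observation dimensions, and velocity bound are illustrative assumptions.

```python
# Minimal sketch (assumed, not the authors' implementation) of the
# hybrid-action idea in AG-PMADDPG: a GU's discrete choice among K
# UAV-BSs is relaxed into a continuous probability vector, while each
# UAV-BS outputs a continuous trajectory action directly.
import torch
import torch.nn as nn


class GUActor(nn.Module):
    """Ground-user actor: maps a local observation to access probabilities."""

    def __init__(self, obs_dim: int, num_uav_bs: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_uav_bs),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # Softmax turns the discrete access choice into a continuous,
        # differentiable probability vector over the UAV-BSs.
        return torch.softmax(self.net(obs), dim=-1)


class UAVActor(nn.Module):
    """UAV-BS actor: maps a local observation to a continuous 2-D velocity."""

    def __init__(self, obs_dim: int, hidden: int = 64, v_max: float = 20.0):
        super().__init__()
        self.v_max = v_max  # assumed maximum UAV speed (m/s), illustrative
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # tanh bounds each velocity component to [-v_max, v_max].
        return self.v_max * torch.tanh(self.net(obs))


# Example: one GU choosing among 3 UAV-BSs from a 10-dim local observation.
gu = GUActor(obs_dim=10, num_uav_bs=3)
probs = gu(torch.randn(1, 10))                # e.g. [[0.2, 0.5, 0.3]]
access = torch.distributions.Categorical(probs).sample()  # executed discrete action
```

In a setup like this, the GU samples (or takes the argmax of) the probability vector to obtain the discrete access action actually executed, while the critic is trained on the continuous probabilities; that relaxation is what makes the deterministic policy gradient applicable to both the discrete GU actions and the continuous UAV-BS actions.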