REPlanner: Efficient UAV Trajectory-Planning using Economic Reinforcement Learning

Alvi Ataur Khalil, Alexander J. Byrne, Mohammad Ashiqur Rahman, Mohammad Hossein Manshaei

Published: 01 Jan 2021, Last Modified: 18 Jun 2024, SMARTCOMP 2021, CC BY-SA 4.0

Abstract: Advances in unmanned aerial vehicle (UAV) design and capability, together with falling manufacturing costs, have opened up applications of UAVs in various fields, including surveillance, firefighting, cellular networks, and delivery. The unique characteristics of UAVs in these systems create a novel set of trajectory (path) planning and coordination problems. Environments typically contain many more points of interest (POIs) than UAVs, along with obstacles and no-fly zones. We introduce REPlanner, a novel multi-agent reinforcement learning algorithm inspired by economic transactions to distribute tasks among UAVs. The system is built around economic theory, in particular an auction mechanism in which UAVs trade assigned POIs. We formulate the path planning problem as a multi-agent economic game in which agents can cooperate and compete for resources. We then translate the problem into a partially observable Markov decision process (POMDP), which is solved using a reinforcement learning (RL) model deployed on each agent. Because the system computes task distributions through UAV cooperation, it is highly resilient to changes in the swarm size. Our proposed network and economic game architecture can effectively coordinate the swarm as an emergent phenomenon while maintaining the swarm’s operation. Evaluation results show that REPlanner outperforms conventional RL-based trajectory search.
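To make the idea of UAVs trading assigned POIs concrete, the following is a minimal sketch of a greedy trading loop. It is not REPlanner itself: the abstract does not specify the bidding rule, the reward, or the RL model, so the cost function here (straight-line distance plus a per-POI load penalty), the function name `trade_until_stable`, and all parameters are illustrative assumptions, and hand-written bids stand in for the learned policies.

```python
import math
from collections import Counter


def trade_until_stable(uav_positions, pois, assignment,
                       load_penalty=1.0, max_rounds=100):
    """Greedy POI trading (hypothetical stand-in for a learned auction policy).

    uav_positions: dict of UAV id -> (x, y)
    pois:          dict of POI id -> (x, y)
    assignment:    dict of POI id -> UAV id currently holding that POI
    """
    assignment = dict(assignment)  # work on a copy, leave the input untouched
    for _ in range(max_rounds):
        transfers = 0
        for poi_id in sorted(assignment):
            owner = assignment[poi_id]
            load = Counter(assignment.values())  # POIs held per UAV right now
            target = pois[poi_id]
            # Assumed cost for the owner to keep the POI: distance plus a
            # penalty for every other POI it already holds.
            keep_cost = (math.dist(uav_positions[owner], target)
                         + load_penalty * (load[owner] - 1))
            # Every other UAV bids its own assumed cost of taking the POI.
            bids = {u: math.dist(pos, target) + load_penalty * load[u]
                    for u, pos in uav_positions.items() if u != owner}
            if not bids:
                continue  # a single UAV has nobody to trade with
            winner = min(bids, key=bids.get)
            if bids[winner] < keep_cost:
                assignment[poi_id] = winner  # the POI changes hands
                transfers += 1
        if transfers == 0:
            break  # no UAV can improve on the current allocation
    return assignment


uavs = {"a": (0, 0), "b": (10, 0)}
pois = {0: (1, 0), 1: (9, 0), 2: (2, 0), 3: (8, 0)}
start = {0: "a", 1: "a", 2: "a", 3: "a"}  # everything initially held by UAV "a"
print(trade_until_stable(uavs, pois, start))
```

Because each UAV only compares its own cost against the current holder's, adding or removing a UAV just changes the set of bidders in the next round, which gives some intuition for the resilience to swarm-size changes mentioned above.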