Deep Reinforcement Learning of Collision-Free Flocking Policies for Multiple Fixed-Wing UAVs Using Local Situation Maps

Published: 01 Jan 2022, Last Modified: 13 Nov 2023. IEEE Trans. Ind. Informatics, 2022.
Abstract: The evolution of artificial intelligence and the Internet of Things (IoT) envisions a highly integrated artificial IoT (AIoT) network. Flocking and cooperation with multiple unmanned aerial vehicles (UAVs) are expected to play a vital role in industrial AIoT networks. In this article, we formulate the collision-free flocking problem of fixed-wing UAVs as a Markov decision process and solve it in the deep reinforcement learning (DRL) framework. Our method can deal with a variable number of followers by encoding the dynamic environmental state into a fixed-length embedding tensor. Specifically, each follower constructs a fixed-size local situation map that describes the collision risks posed by other followers nearby. These local situation maps are fed into the proposed DRL algorithm, MA2D3QN, to learn collision-free flocking behavior. To further improve learning efficiency, we design a reference-point-based action selection strategy and an adaptive mechanism. We compare MA2D3QN with several benchmark DRL algorithms in numerical simulation and verify its advantages in learning efficiency and performance. Finally, we demonstrate the scalability and adaptability of MA2D3QN in a semiphysical simulation experiment.
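As a rough illustration of the state encoding described in the abstract, the sketch below builds a fixed-size, grid-shaped local situation map from the relative positions of nearby followers in the ego UAV's body frame. The grid resolution, sensing radius, and distance-based risk kernel are illustrative assumptions, not the paper's exact construction.

# Illustrative sketch (assumed parameters, not the paper's exact construction):
# encode a variable number of nearby followers into a fixed-size map whose
# cells hold a heuristic collision-risk value.
import numpy as np

MAP_SIZE = 32          # assumed grid resolution (MAP_SIZE x MAP_SIZE cells)
SENSE_RADIUS = 200.0   # assumed sensing radius in meters

def local_situation_map(ego_pos, ego_heading, neighbor_positions):
    """Return a (MAP_SIZE, MAP_SIZE) risk map regardless of the number of neighbors.

    ego_pos: (2,) array, ego position in the world frame.
    ego_heading: heading angle in radians.
    neighbor_positions: (N, 2) array of nearby followers (N may vary).
    """
    risk_map = np.zeros((MAP_SIZE, MAP_SIZE), dtype=np.float32)
    if len(neighbor_positions) == 0:
        return risk_map

    # Rotate neighbors into the ego body frame so the map is heading-aligned.
    c, s = np.cos(-ego_heading), np.sin(-ego_heading)
    rot = np.array([[c, -s], [s, c]])
    rel = (np.asarray(neighbor_positions) - np.asarray(ego_pos)) @ rot.T

    cell = 2.0 * SENSE_RADIUS / MAP_SIZE  # meters per cell
    for dx, dy in rel:
        dist = np.hypot(dx, dy)
        if dist > SENSE_RADIUS:
            continue  # neighbor outside the local field of view
        # Map the body-frame offset to grid indices (ego at the map center).
        ix = int(np.clip((dx + SENSE_RADIUS) / cell, 0, MAP_SIZE - 1))
        iy = int(np.clip((dy + SENSE_RADIUS) / cell, 0, MAP_SIZE - 1))
        # Closer neighbors contribute higher risk; keep the maximum per cell.
        risk = np.exp(-dist / (0.25 * SENSE_RADIUS))
        risk_map[iy, ix] = max(risk_map[iy, ix], risk)
    return risk_map

Because the output shape is independent of how many followers are nearby, such a map can be consumed by a convolutional Q-network, which is consistent with the fixed-length encoding the abstract describes.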