Attention-Based Population-Invariant Deep Reinforcement Learning for Collision-Free Flocking with a Scalable Fixed-Wing UAV Swarm

Published: 01 Jan 2022, Last Modified: 13 Nov 2023 (IROS 2022)
Abstract: A swarm of fixed-wing unmanned aerial vehicles (UAVs) is expected to efficiently accomplish various tasks in complex scenarios. This paper proposes an attention-based population-invariant multi-agent deep reinforcement learning (MADRL) approach to the decentralized collision-free flocking problem for a scalable fixed-wing UAV swarm. First, the problem is modeled as a decentralized partially observable Markov decision process from the perspective of each follower. Then, an improved multi-agent deep deterministic policy gradient (MADDPG) algorithm is presented to efficiently learn the population-invariant flocking policy. In this algorithm, parameter sharing with an ego-centric representation is incorporated to improve learning efficiency. In addition, an attention-based population-invariant network structure (APINet) is designed by leveraging the self-attention mechanism, so that the learned flocking policy is invariant to the swarm population. Finally, both numerical and hardware-in-the-loop simulation results verify the efficiency and scalability of the proposed approach.
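The abstract describes APINet only at a high level, so the snippet below is a minimal, hypothetical PyTorch sketch of the general idea: a self-attention encoder pools a variable number of ego-centric neighbor observations into a fixed-size embedding, which is what allows a downstream policy to be independent of the swarm population. All layer sizes, feature dimensions, and the query/pooling scheme here are assumptions for illustration, not details taken from the paper.

```python
# Hypothetical population-invariant observation encoder (not the paper's exact APINet).
import torch
import torch.nn as nn

class PopulationInvariantEncoder(nn.Module):
    """Encodes a variable number of neighbor observations into a fixed-size
    embedding via self-attention, so the policy input does not depend on
    how many UAVs are in the swarm."""
    def __init__(self, ego_dim, nbr_dim, embed_dim=64, num_heads=4):
        super().__init__()
        self.ego_proj = nn.Linear(ego_dim, embed_dim)   # ego-centric state of the follower
        self.nbr_proj = nn.Linear(nbr_dim, embed_dim)   # relative states of observed neighbors
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.out = nn.Linear(2 * embed_dim, embed_dim)

    def forward(self, ego_obs, nbr_obs, nbr_mask=None):
        # ego_obs:  (batch, ego_dim)
        # nbr_obs:  (batch, n_neighbors, nbr_dim); n_neighbors may vary between rollouts
        # nbr_mask: (batch, n_neighbors), True where a neighbor slot is padding
        q = self.ego_proj(ego_obs).unsqueeze(1)         # query: the ego embedding
        kv = self.nbr_proj(nbr_obs)                     # keys/values: neighbor embeddings
        ctx, _ = self.attn(q, kv, kv, key_padding_mask=nbr_mask)
        # Concatenate the ego embedding with the attention-pooled neighbor context;
        # the result has a fixed size regardless of the number of neighbors.
        return self.out(torch.cat([q.squeeze(1), ctx.squeeze(1)], dim=-1))

if __name__ == "__main__":
    enc = PopulationInvariantEncoder(ego_dim=6, nbr_dim=8)
    for n in (3, 7, 15):                                # different swarm populations
        z = enc(torch.randn(2, 6), torch.randn(2, n, 8))
        print(n, z.shape)                               # embedding shape is (2, 64) in every case
```

Because the attention pooling is permutation-invariant and accepts any number of neighbors, the same encoder (and hence the same policy network) can, in principle, be reused as the swarm scales, which mirrors the scalability property claimed for the learned flocking policy.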