Abstract: Autonomous volumetric scanning of three-dimensional environments is critical for environmental monitoring, infrastructure inspection, and search-and-rescue applications. Efficient coordination of multiple Unmanned Aerial Vehicles (UAVs) is essential for complete, energy-aware coverage of complex spaces. In this work, a Reinforcement Learning (RL)-based framework is proposed for coordinating a leader-follower UAV system performing volumetric scanning. The system consists of two heterogeneous UAVs that carry directional sensors and maintain a constant mutual orientation throughout the mission. A centralized control policy, trained with Proximal Policy Optimization (PPO), controls the leader UAV and produces trajectory commands for the follower, yielding synchronized movement and effective space coverage. The observation space comprises a local 3D occupancy map around the leader and both UAVs' battery levels, enabling energy-aware decision-making. The reward function is carefully de…
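The abstract names two concrete mechanisms: an observation vector built from a local occupancy map plus both battery levels, and a follower that tracks the leader at a constant mutual orientation. The sketch below illustrates both under stated assumptions; the grid shape, encoding values, and the fixed body-frame offset `offset_body` are hypothetical choices, not details taken from the paper.

```python
import numpy as np


def build_observation(occupancy, leader_batt, follower_batt):
    """Assemble the policy observation described in the abstract (sketch).

    occupancy: (D, H, W) local 3D occupancy grid around the leader,
               e.g. 0 = free, 0.5 = unknown, 1 = occupied (assumed encoding).
    leader_batt, follower_batt: state-of-charge values in [0, 1].
    """
    return np.concatenate(
        [occupancy.ravel(), [leader_batt, follower_batt]]
    ).astype(np.float32)


def follower_command(leader_pos, leader_yaw, offset_body=np.array([0.0, -2.0, 0.0])):
    """Derive a follower waypoint from the leader pose so the pair keeps a
    constant mutual orientation: a fixed offset in the leader's body frame,
    rotated into the world frame by the leader's yaw."""
    c, s = np.cos(leader_yaw), np.sin(leader_yaw)
    yaw_rot = np.array([[c, -s, 0.0],
                        [s,  c, 0.0],
                        [0.0, 0.0, 1.0]])
    return leader_pos + yaw_rot @ offset_body
```

A PPO policy would consume `build_observation(...)` each step and emit a trajectory command for the leader, from which `follower_command(...)` derives the follower's synchronized waypoint.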
External IDs: dblp:conf/icinco/KathreinBMWDH25