Keywords: Reinforcement Learning, Coordinated Flight, Wildfire Suppression
Abstract: Wildfires are increasing in frequency and severity, creating urgent demand for safer and more efficient suppression methods than conventional crewed aircraft can provide. This work presents a multi-agent reinforcement learning (MARL) framework to coordinate swarms of unmanned aerial vehicles (UAVs) for adaptive wildfire suppression. A simulated fire environment models dynamic spread influenced by wind and terrain, while UAV agents operate under a decentralized policy trained with Proximal Policy Optimization (PPO). Each agent receives state inputs such as fire-front proximity, water level, wind direction, and neighbouring UAV positions, and selects actions including navigation, water release, refilling, and holding patterns. Compared with a rule-based baseline, the learned policies consistently achieve higher containment efficiency, fewer redundant water drops, and shorter suppression times. Emergent behaviours such as cooperative task sharing, non-overlapping coverage, and collision-free coordination arise without explicit programming. These results demonstrate that MARL enables robust, scalable, and autonomous UAV swarm control, offering a promising pathway toward next-generation aerial wildfire management.
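The abstract does not include code, so the following Python sketch (using gymnasium and numpy) illustrates how the per-agent interface it describes might be encoded. All concrete values here (the number of observed neighbours, the action ordering, the normalization ranges, and the helper name flatten_obs) are assumptions for illustration, not details taken from the paper.

```python
# Hypothetical sketch of the per-agent observation and action spaces
# implied by the abstract. N_NEIGHBORS, the action ordering, and the
# value ranges are assumptions, not specified in the paper.
import numpy as np
from gymnasium import spaces

N_NEIGHBORS = 4  # assumed number of neighbouring UAVs each agent observes

# Decentralized observation: each UAV sees only its local state,
# matching the inputs listed in the abstract.
observation_space = spaces.Dict({
    "fire_front_proximity": spaces.Box(0.0, 1.0, shape=(1,), dtype=np.float32),
    "water_level":          spaces.Box(0.0, 1.0, shape=(1,), dtype=np.float32),
    "wind_direction":       spaces.Box(-1.0, 1.0, shape=(2,), dtype=np.float32),  # unit vector
    "neighbor_positions":   spaces.Box(-1.0, 1.0, shape=(N_NEIGHBORS, 2), dtype=np.float32),
})

# Discrete action set matching the abstract's action list.
# 0-3: navigate N/E/S/W, 4: release water, 5: refill, 6: hold.
action_space = spaces.Discrete(7)

def flatten_obs(obs: dict) -> np.ndarray:
    """Concatenate the dict observation into a flat vector for the policy network."""
    return np.concatenate([
        obs["fire_front_proximity"],
        obs["water_level"],
        obs["wind_direction"],
        obs["neighbor_positions"].ravel(),
    ]).astype(np.float32)
```

Under parameter sharing, a single PPO policy taking flatten_obs vectors as input and trained across all agents (for example via stable-baselines3 or RLlib) would realize the decentralized control scheme the abstract describes.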
Submission Number: 42