Abstract: Highlights
• Objective: Learning a multi-UAV control policy for comprehensive flood area coverage by leveraging domain knowledge.
• Deep Reinforcement Learning: Autonomous UAV controls are acquired through DDPG, utilizing a novel target actor based on the D-infinity flow estimation algorithm for directed exploration.
• Avoiding clustering: Implementing a Path Scatter strategy to efficiently maintain an expanded multi-UAV spread.
• Evaluation: Conducting extensive experiments to assess the algorithm’s performance against different baselines in simulated environments.