Keywords: MARL, pursuit-evasion, graph attention, path planning
TL;DR: We propose a neural framework for visibility-based pursuit-evasion that learns a coordinated yet distributed multi-agent policy to search for worst-case evaders, significantly improving success rates across various maps.
Abstract: In visibility-based pursuit-evasion tasks, a team of mobile pursuer robots with limited sensing capabilities is tasked with detecting all evaders in a multiply-connected planar environment, whose map may or may not be known to the pursuers beforehand. This task requires tight coordination among multiple agents to guarantee detection of evaders that are omniscient and potentially arbitrarily fast. Whereas existing methods typically rely on a relatively large team of agents to clear the environment, we propose ViPER, a neural solution that leverages a graph attention network to learn a coordinated yet distributed policy via multi-agent reinforcement learning (MARL). We experimentally demonstrate that ViPER significantly outperforms state-of-the-art non-learning planners, showcasing its emergent coordinated behaviors and its adaptability to more challenging scenarios and various team sizes, and we deploy its learned policy on hardware in an aerial search task.
Supplementary Material: zip
Submission Number: 664