Improving Learnt Local MAPF Policies with Heuristic Search

Rishi Veerapaneni; Qian Wang; Kevin Ren; Arthur Jakobsson; Jiaoyang Li; Maxim Likhachev

Published: 12 Feb 2024, Last Modified: 06 Mar 2024, Venue: ICAPS 2024, License: CC BY 4.0

Keywords: Multi-agent path finding, learning, heuristic search

Abstract: Multi-agent path finding (MAPF) is the problem of finding collision-free paths for a team of agents to reach their goal locations. State-of-the-art classical MAPF solvers typically employ heuristic search to find solutions for hundreds of agents, but they are usually centralized and can struggle to scale to larger numbers of agents. Machine learning (ML) approaches that learn policies for each agent are appealing because they could be decentralized and scale well while maintaining good solution quality. Current ML approaches to MAPF have started to scratch the surface of this potential. However, state-of-the-art ML approaches produce "local" policies that only plan for a single timestep and have poor success rates and scalability. Our main idea is that we can improve an ML local policy by applying heuristic search methods to its output probability distribution to resolve deadlocks and enable full-horizon planning. We show several model-agnostic ways to use heuristic search with ML that significantly improve the local ML policy's success rate and scalability. To the best of our knowledge, this is the first demonstration of ML-based MAPF approaches scaling to congestion levels (e.g., 40% agent density) similar to those handled by state-of-the-art heuristic search methods.
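
To make the idea of searching over a policy's output distribution concrete, the following is a minimal illustrative sketch and not the authors' method. It assumes a 4-connected grid, a hypothetical per-agent dictionary of action probabilities standing in for the learnt policy's output, and a plain depth-first search with backtracking that picks, for a single timestep, the most probable actions that avoid vertex and edge (swap) collisions. The names `MOVES` and `resolve_step` are invented for this example.

```python
from typing import Dict, List, Optional, Set, Tuple

Cell = Tuple[int, int]
# Hypothetical action set: wait plus the four grid moves (row, column offsets).
MOVES: Dict[str, Cell] = {"wait": (0, 0), "up": (-1, 0), "down": (1, 0),
                          "left": (0, -1), "right": (0, 1)}


def resolve_step(positions: List[Cell],
                 action_probs: List[Dict[str, float]],
                 free: Set[Cell]) -> Optional[List[Cell]]:
    """Pick one collision-free next cell per agent, preferring likely actions.

    Agents are processed in index order. Each agent tries its actions from
    most to least probable and the search backtracks on conflict.
    Returns the next cells, or None if no collision-free joint move exists.
    """
    n = len(positions)
    chosen: List[Cell] = []  # next cells of agents 0..len(chosen)-1

    def conflicts(i: int, target: Cell) -> bool:
        for j, other in enumerate(chosen):
            if other == target:  # vertex collision: same next cell
                return True
            # Edge collision: agents i and j would swap cells.
            if other == positions[i] and target == positions[j]:
                return True
        return False

    def search(i: int) -> bool:
        if i == n:
            return True
        ranked = sorted(action_probs[i], key=action_probs[i].get, reverse=True)
        for action in ranked:
            dr, dc = MOVES[action]
            target = (positions[i][0] + dr, positions[i][1] + dc)
            if target not in free or conflicts(i, target):
                continue
            chosen.append(target)
            if search(i + 1):
                return True
            chosen.pop()  # undo and try the next most probable action
        return False

    return list(chosen) if search(0) else None


# Two agents facing each other in a 1x3 corridor. Agent 1's preferred
# move ("left") would swap with agent 0, so the search falls back to "right".
corridor = {(0, 0), (0, 1), (0, 2)}
probs = [{"right": 0.9, "wait": 0.1},
         {"left": 0.8, "wait": 0.15, "right": 0.05}]
next_cells = resolve_step([(0, 0), (0, 1)], probs, corridor)
```

Exhaustive backtracking is exponential in the number of agents in the worst case, so this only illustrates the principle of letting search override a policy's top choice when that choice would collide. It does not plan beyond one timestep.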

Primary Keywords: Learning, Multi-Agent Planning

Category: Long

Student: Graduate

Supplementary Material: pdf

Submission Number: 37
