TaskExp: Enhancing Generalization of Multi-Robot Exploration with Multi-Task Pre-Training

Published: 2025, Last Modified: 05 Nov 2025, ICRA 2025, CC BY-SA 4.0
Abstract: We aim to develop a general multi-agent reinforcement learning (MARL) policy that enables a group of robots to efficiently explore large-scale, unknown environments with random pose initialization. Existing MARL-based multi-robot exploration methods struggle to map observations to actions reliably in large-scale scenarios and lack zero-shot generalization to unknown environments. To this end, we propose a generic multi-task pre-training algorithm (termed TaskExp) to enhance the generalization of learning-based policies. In particular, we design a decision-related task to guide the policy to focus on valuable subspaces of the action space, improving the reliability of policy mapping. Moreover, two perception-related tasks, Location Estimation and Map Prediction, are designed to enhance the zero-shot capability of the policy by guiding it to extract general invariant features from unknown environments. With TaskExp pre-training, our policy significantly outperforms state-of-the-art planning-based methods in large-scale scenarios and demonstrates strong zero-shot performance in unseen environments. Furthermore, TaskExp can be easily integrated into existing learning-based multi-robot exploration methods to improve them.