Fleet Supervisor Allocation: A Submodular Maximization Approach

Published: 05 Sept 2024, Last Modified: 05 Sept 2024 · CoRL 2024 · CC BY 4.0
Keywords: Imitation Learning, Submodular Maximization, Fleet Learning
Abstract: In real-world scenarios, the data collected by robots in diverse and unpredictable environments is crucial for improving their models and policies. This data is predominantly collected under human supervision, particularly through imitation learning (IL), where robots learn complex tasks by observing human supervisors. However, deploying multiple robots and supervisors to accelerate learning often leads to data redundancy and inefficiency, especially as robot fleets scale up. Moreover, reliance on teleoperation for supervision introduces additional challenges due to potential network connectivity issues. To address these inefficiencies and the reliability concerns of network-dependent supervision, we introduce an adaptive submodular maximization-based policy (ASA) for efficiently allocating human supervision within multi-robot systems under uncertain connectivity. Our approach significantly reduces data redundancy by balancing the informativeness and diversity of the collected data, and it accommodates variations in connectivity. We evaluated the effectiveness of ASA in simulation with fleets of 100 robots across four environments and various network settings, including a real-world teleoperation scenario over a 5G network. We trained and tested both our policy and state-of-the-art baselines in NVIDIA's Isaac Gym; our results show that ASA improves the return on human effort by up to $5.95\times$, outperforming current baselines in all simulated scenarios while remaining robust to connectivity disruptions.
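The core idea of allocating a limited supervision budget by submodular maximization can be illustrated with the classic greedy algorithm. The sketch below is a minimal, hypothetical stand-in: the `feature_sets` coverage objective is an assumed proxy for the informativeness/diversity trade-off described in the abstract, not the paper's actual ASA objective, and the function names are invented for illustration.

```python
def coverage(selected, feature_sets):
    """Monotone submodular objective: number of distinct features covered
    by the data the selected robots would collect (hypothetical proxy)."""
    covered = set()
    for i in selected:
        covered |= feature_sets[i]
    return len(covered)


def greedy_allocate(feature_sets, k):
    """Pick up to k robots to supervise, greedily maximizing marginal gain.

    For a monotone submodular objective, this greedy rule attains a
    (1 - 1/e) approximation of the optimal set (Nemhauser et al., 1978).
    """
    selected = []
    for _ in range(k):
        base = coverage(selected, feature_sets)
        best_i, best_gain = None, 0
        for i in range(len(feature_sets)):
            if i in selected:
                continue
            gain = coverage(selected + [i], feature_sets) - base
            if gain > best_gain:
                best_i, best_gain = i, gain
        if best_i is None:  # no remaining robot adds new coverage
            break
        selected.append(best_i)
    return selected


# Example: robot 3's data covers the most features, so it is chosen first;
# robot 2 then contributes the only remaining uncovered feature.
print(greedy_allocate([{1, 2}, {2, 3}, {4}, {1, 2, 3}], k=2))  # → [3, 2]
```

The adaptive setting in the paper additionally handles uncertain connectivity, which this budget-constrained sketch omits.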
Supplementary Material: zip
Submission Number: 582