Environment as Policy: Learning to Race in Unseen Tracks

Published: 22 May 2025, Last Modified: 22 May 2025 · RoboLetics 2.0, ICRA 2025 (Oral) · CC BY 4.0
Keywords: Autonomous Drone Racing, Reinforcement Learning, Environment Shaping
TL;DR: Dynamically tailor training tasks based on the agent's progress to enhance overall policy generalization.
Abstract: Reinforcement learning (RL) has achieved outstanding success in complex robot control tasks such as drone racing, where RL agents have outperformed human champions on a known racing track. However, these agents fail on unseen track configurations and require complete retraining whenever a new track layout is presented. This work aims to develop RL agents that generalize effectively to novel track configurations without retraining. To enhance the generalizability of the RL agent, we propose an adaptive environment-shaping framework that dynamically adjusts the training environment based on the agent's performance. We achieve this by leveraging a secondary RL policy to design environments that strike a balance between being challenging and achievable, allowing the agent to adapt and improve progressively. Using our adaptive environment shaping, a single racing policy efficiently learns to race on diverse, challenging tracks.
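The core loop described in the abstract can be sketched as follows. This is a minimal illustration with hypothetical names: the paper uses a secondary RL policy to shape the environment, whereas here a simple proportional difficulty controller stands in for that policy, and the track parameterization (gate count, spacing, jitter) is an assumption, not taken from the paper.

```python
import random

class EnvironmentShaper:
    """Illustrative stand-in for the environment-shaping policy: adjust
    track difficulty so the racing agent's success rate stays near a
    target band, i.e. tracks remain challenging but achievable."""

    def __init__(self, target_success=0.5, step=0.05):
        self.difficulty = 0.1        # 0 = trivial track, 1 = hardest track
        self.target_success = target_success
        self.step = step

    def update(self, success_rate):
        # Raise difficulty when the agent succeeds too often,
        # lower it when the current tracks are too hard.
        if success_rate > self.target_success:
            self.difficulty = min(1.0, self.difficulty + self.step)
        else:
            self.difficulty = max(0.0, self.difficulty - self.step)
        return self.difficulty

    def sample_track(self, rng):
        # Harder tracks: more gates, tighter spacing (hypothetical mapping).
        n_gates = 3 + int(self.difficulty * 7)
        spacing = 10.0 - 6.0 * self.difficulty
        jitter = [rng.uniform(-1, 1) * self.difficulty for _ in range(n_gates)]
        return {"n_gates": n_gates, "gate_spacing_m": spacing, "jitter": jitter}

# Training loop skeleton: evaluate the agent, then reshape the environment.
shaper = EnvironmentShaper()
rng = random.Random(0)
for epoch in range(5):
    success_rate = 0.8   # placeholder for the agent's measured success rate
    shaper.update(success_rate)
track = shaper.sample_track(rng)
```

Under this sketch, a consistently successful agent sees progressively harder tracks, while a struggling agent gets easier ones, which is the "challenging but achievable" balance the abstract describes.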
Submission Number: 5