Keywords: Reinforcement Learning, Trajectory Prediction, Autonomous Driving
Abstract: Deep neural networks (DNNs) and inverse reinforcement learning (IRL) are both commonly used in autonomous driving to predict trajectories by learning from expert demonstrations. However, DNN-based methods suffer from data scarcity, while IRL-based approaches often struggle with generalizability, making both hard to apply to new driving scenarios. To address these issues, we introduce EscIRL, a novel decoupled bi-level training framework that iteratively learns robust reward models from only a few mixed-scenario demonstrations. At the inner level, EscIRL introduces a self-contrastive IRL module that learns a spectrum of specialized reward functions by contrasting demonstrations across different scenarios. At the outer level, EscIRL employs an evolving loop that iteratively refines the contrastive sets, ensuring global convergence. Experiments on two multi-scenario datasets, CitySim and INTERACTION, demonstrate the effectiveness of EscIRL, outperforming state-of-the-art DNN- and IRL-based methods by 41.3% on average. Notably, we show that EscIRL achieves superior generalizability compared to DNN-based approaches while requiring only a small fraction of the data, effectively mitigating data scarcity. All code and data are available at https://github.com/SiyueWang-CiDi/EscIRL.
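The abstract describes a decoupled bi-level scheme: an inner self-contrastive IRL step and an outer loop that evolves the contrastive sets. Below is a minimal, speculative sketch of that control flow; the function names (fit_reward_contrastive, refine_contrastive_sets) and the convergence test are illustrative placeholders, not the authors' actual API or algorithm.

```python
# Hypothetical sketch of the bi-level loop suggested by the abstract.
# fit_reward_contrastive and refine_contrastive_sets are assumed
# placeholders supplied by the user, not part of the EscIRL codebase.
from typing import Callable, Dict, List, Tuple

Demo = List[Tuple[object, object]]  # one trajectory as (state, action) pairs


def esc_irl(
    demos_by_scenario: Dict[str, List[Demo]],
    fit_reward_contrastive: Callable[[List[Demo], List[Demo]], Callable],
    refine_contrastive_sets: Callable[
        [Dict[str, Callable], Dict[str, List[Demo]]], Dict[str, List[Demo]]
    ],
    max_outer_iters: int = 20,
) -> Dict[str, Callable]:
    """Outer evolving loop wrapping an inner self-contrastive IRL step."""
    # Initial contrastive sets: each scenario's demos are contrasted
    # against the demos of every other scenario.
    negatives = {
        s: [d for t, ds in demos_by_scenario.items() if t != s for d in ds]
        for s in demos_by_scenario
    }
    rewards: Dict[str, Callable] = {}
    for _ in range(max_outer_iters):
        # Inner level: learn one specialized reward per scenario by
        # contrasting its demonstrations against its current negative set.
        rewards = {
            s: fit_reward_contrastive(demos_by_scenario[s], negatives[s])
            for s in demos_by_scenario
        }
        # Outer level: refine the contrastive sets with the learned rewards.
        new_negatives = refine_contrastive_sets(rewards, demos_by_scenario)
        if new_negatives == negatives:  # fixed point: sets stopped evolving
            break
        negatives = new_negatives
    return rewards
```

This only captures the iteration structure (specialized rewards per scenario, contrastive sets refined until they stabilize); the paper's actual inner IRL objective and refinement rule are not specified in the abstract.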
Code: https://github.com/SiyueWang-CiDi/EscIRL
Publication Agreement: pdf
Student Paper: yes
Spotlight Video: mp4
Submission Number: 217