Contrastive Inverse Reinforcement Learning for Highway Driving Behavior Optimization

ICLR 2026 Conference Submission 21273 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Inverse reinforcement learning, contrastive learning, highway driving behavior, driving optimization
Abstract: Autonomous driving systems are expected not only to replicate proper human driving behavior but also to adapt to dynamic driving scenarios. Imitation learning (IL) and inverse reinforcement learning (IRL) are natural tools for reproducing human behavior, but traditional IRL methods suffer from low sample efficiency and poor generalization, especially in autonomous driving settings with limited vehicle demonstrations and shifts in the driving behavior distribution. In this paper, we propose a Contrastive Inverse Reinforcement Learning (CIRL) framework that enhances reward learning via self-supervised contrastive representations. CIRL improves efficiency and robustness by 1) integrating reward regularization into the contrastive loss and 2) employing momentum encoders to stabilize contrastive feature learning under driving-specific perturbations. Furthermore, our approach supports personalized driving policies by modeling individual driving styles from a small number of vehicle demonstrations. Extensive experiments on the NGSIM US-101 and I-80 highway datasets demonstrate that CIRL consistently outperforms state-of-the-art IRL methods, achieving improvements of 12.5% in human-likeness, 86.2% in safety, and 17.8% in generalization to new environments. In addition, an ablation study validates the necessity of each component, confirming that momentum encoding, reward regularization, and learnable similarity functions collectively contribute to CIRL's robust and generalizable performance in real-world driving scenarios.
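To make the abstract's key mechanisms concrete, below is a minimal, hypothetical sketch of contrastive reward learning with a momentum encoder, a learnable bilinear similarity, and a reward-regularization term. The paper's exact architecture, loss weights, and augmentations are not given here, so every module name, dimension, and the form of the regularizer are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContrastiveRewardLearner(nn.Module):
    """Hypothetical CIRL-style module: an online encoder, a momentum
    (EMA) encoder for stable contrastive targets, a learnable bilinear
    similarity, and a reward head tied to the contrastive objective."""

    def __init__(self, state_dim: int, feat_dim: int = 128,
                 momentum: float = 0.99, temperature: float = 0.1):
        super().__init__()
        self.momentum = momentum
        self.temperature = temperature
        self.encoder = nn.Sequential(
            nn.Linear(state_dim, 256), nn.ReLU(), nn.Linear(256, feat_dim))
        # Momentum encoder: an EMA copy of the online encoder; it receives
        # no gradients and is updated only via _update_momentum_encoder.
        self.momentum_encoder = nn.Sequential(
            nn.Linear(state_dim, 256), nn.ReLU(), nn.Linear(256, feat_dim))
        self.momentum_encoder.load_state_dict(self.encoder.state_dict())
        for p in self.momentum_encoder.parameters():
            p.requires_grad = False
        # Learnable similarity: a bilinear form q^T W k rather than a
        # fixed cosine similarity.
        self.W = nn.Parameter(torch.eye(feat_dim))
        # Reward head mapping encoder features to a scalar reward.
        self.reward_head = nn.Linear(feat_dim, 1)

    @torch.no_grad()
    def _update_momentum_encoder(self):
        for p, p_m in zip(self.encoder.parameters(),
                          self.momentum_encoder.parameters()):
            p_m.mul_(self.momentum).add_(p, alpha=1.0 - self.momentum)

    def forward(self, anchor_states, positive_states, reg_weight=0.1):
        # anchor/positive: two driving-specific augmentations (e.g. sensor
        # noise or trajectory jitter -- assumed here) of the same states.
        self._update_momentum_encoder()
        z_q = self.encoder(anchor_states)
        q = F.normalize(z_q, dim=-1)
        with torch.no_grad():
            k = F.normalize(self.momentum_encoder(positive_states), dim=-1)
        # InfoNCE with learnable similarity: matched pairs on the diagonal.
        logits = (q @ self.W @ k.t()) / self.temperature
        labels = torch.arange(q.size(0), device=q.device)
        contrastive_loss = F.cross_entropy(logits, labels)
        # Reward regularization (assumed form): rewards assigned to the
        # two views of the same state should agree.
        r_q = self.reward_head(z_q)
        r_k = self.reward_head(self.encoder(positive_states))
        reward_reg = F.mse_loss(r_q, r_k)
        return contrastive_loss + reg_weight * reward_reg
```

In this sketch, the MoCo-style EMA update keeps the contrastive keys slowly moving, which is the stabilizing effect the abstract attributes to momentum encoding, while the learnable bilinear matrix W replaces a fixed similarity metric.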
Primary Area: reinforcement learning
Submission Number: 21273