Deep Surrogate Q-Learning for Autonomous Driving

Maria Kalweit, Gabriel Kalweit, Moritz Werling, Joschka Boedecker

2022 (modified: 22 Nov 2022) | ICRA 2022 | Readers: Everyone

Abstract: Open challenges for deep reinforcement learning systems are their adaptivity to changing environments and their efficiency with respect to computational resources and data. When learning lane-change behavior for autonomous driving, the number of required transitions is a bottleneck, since test drivers cannot perform an arbitrary number of lane changes in the real world. In the off-policy setting, additional information on solving the task can be gained by observing the actions of others. While this knowledge remains unused in the classical RL setup, we use other drivers as surrogates to learn the agent's value function more efficiently. We propose Surrogate Q-learning, which addresses the aforementioned problems and drastically reduces the required driving time. We further propose an efficient implementation based on a permutation-equivariant deep neural network architecture of the Q-function to estimate action-values for a variable number of vehicles in sensor range. We evaluate our method in the open traffic simulator SUMO and learn well-performing driving policies on the real highD dataset.
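
To illustrate what "permutation equivariant" means for a variable number of vehicles, the following is a minimal sketch of a single Deep Sets style layer in plain Python. It is not the authors' architecture: the function name `equivariant_layer`, the three-number vehicle features, the random weights, and the use of mean pooling are all assumptions chosen for illustration. The only property it is meant to show is that reordering the input vehicles reorders the per-vehicle outputs in the same way, and that the same weights work for any number of vehicles.

```python
import random


def equivariant_layer(vehicles, w_self, w_pool, bias):
    """Hypothetical Deep Sets style layer (illustrative, not from the paper).

    For each vehicle feature vector x_i it computes
        out_i = W_self @ x_i + W_pool @ mean(x) + bias
    The mean over all vehicles is order-independent, so permuting the
    input rows permutes the output rows identically.
    """
    if not vehicles:
        raise ValueError("at least one vehicle is required")
    n = len(vehicles)
    d_in = len(vehicles[0])
    # Order-independent summary of every vehicle in sensor range.
    mean = [sum(v[k] for v in vehicles) / n for k in range(d_in)]

    def affine(x):
        return [
            sum(w_self[j][k] * x[k] for k in range(d_in))
            + sum(w_pool[j][k] * mean[k] for k in range(d_in))
            + bias[j]
            for j in range(len(bias))
        ]

    return [affine(v) for v in vehicles]


# Assumed toy setup: 3 input features per vehicle, 2 outputs per vehicle.
random.seed(0)
D_IN, D_OUT = 3, 2
w_self = [[random.uniform(-1, 1) for _ in range(D_IN)] for _ in range(D_OUT)]
w_pool = [[random.uniform(-1, 1) for _ in range(D_IN)] for _ in range(D_OUT)]
bias = [random.uniform(-1, 1) for _ in range(D_OUT)]

# Made-up features, e.g. (relative distance, speed, lane index).
vehicles = [[0.0, 30.0, 1.0], [12.5, 28.0, 0.0], [-20.0, 33.0, 2.0]]
out = equivariant_layer(vehicles, w_self, w_pool, bias)
```

Calling the layer on `vehicles[:2]` works without any change to the weights, which is the sense in which such a layer can handle a variable number of vehicles. A full Q-function would stack several such layers with nonlinearities; that part is left out here.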
