Keywords: Reinforcement learning, TD3, DDPG, SAC, Autonomous driving
TL;DR: Implementation of deep Q-learning methods in a virtual autonomous driving environment
Abstract: Autonomous driving is becoming the trend for future transportation. One of its most
significant challenges is to recognize traffic signs and obey the traffic rules they specify.
In this paper, the particular problem of optimal vehicle speed control when a vehicle
approaches a speed limit sign is studied. The research is conducted in a longitudinal
environment with a single dimension: a straight traffic lane along which the vehicle drives.
The lane contains multiple traffic signs, which the vehicle can recognize 150 meters in
advance. Three factors are taken into account during control optimization: energy
consumption of the vehicle, jerk (the rate of change of acceleration), and, most importantly,
the speed of the vehicle. Q-learning methods with deep neural networks are applied to the
speed limit control task. More concretely, three deep reinforcement learning methods are
implemented: DDPG, TD3, and SAC. The last of the three is the default method for this
environment and thus serves as the baseline. Experimental results show that TD3 and SAC
both achieve comparably high performance within a five-hour training span. Specifically,
TD3 gains a larger policy improvement per training epoch, while SAC trains more efficiently.
DDPG, on the other hand, performs worse than the other two algorithms due to its instability
during training and its slow training efficiency.
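The abstract names three factors in the control objective: energy consumption, jerk, and speed. A minimal sketch of how such factors might be combined into a per-step reward is shown below; the weights, functional forms, and the energy proxy are illustrative assumptions, not the paper's actual reward design.

```python
def reward(speed, speed_limit, accel, prev_accel, dt=0.1,
           w_speed=1.0, w_jerk=0.1, w_energy=0.01):
    """Hypothetical per-step reward combining the three factors from the
    abstract. All weights and terms are assumed for illustration only."""
    speed_err = abs(speed - speed_limit)   # deviation from the posted limit
    jerk = abs(accel - prev_accel) / dt    # change of acceleration per second
    energy = abs(accel * speed)            # crude proxy for power expenditure
    # Penalize all three factors; speed tracking is weighted most heavily.
    return -(w_speed * speed_err + w_jerk * jerk + w_energy * energy)
```

A reward of this shaped form would let DDPG, TD3, or SAC trade off tracking the speed limit against ride comfort (low jerk) and energy use, with the relative weights tuned per application.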