CAC: Enabling Customer-Centered Passenger-Seeking for Self-Driving Ride Service with Conservative Actor-Critic
Abstract: Rapid advances in perception, planning, and decision-making for self-driving vehicles have greatly improved their capabilities and put several prototypes on public roads, such as Waymo Driver, TuSimple, and Nuro. Among the many applications of self-driving vehicles, ride service is a promising one: it has the potential to improve service quality and productivity and to provide service to anyone at any time. Extensive studies have been conducted on self-driving planning and safety, but few works focus on decision-making and routing for self-driving ride service. In this work, we take the lead in studying the self-driving ride service planning and decision-making problem by leveraging human-generated spatial-temporal data, and we propose a data-driven Conservative Actor-Critic approach (CAC) based on offline reinforcement learning. CAC makes conservative decisions in a complicated environment with multiple goal states, and it avoids dangerous and overly optimistic behaviors by exploiting human decisions. Extensive experiments with real-world data demonstrate that CAC-learned policies substantially improve the efficiency and quality of taxi service operation, shortening passenger waiting time and increasing service revenue.