Highlights
• We design a vehicle value function that optimizes order matching and improves long-term social welfare using a reinforcement learning method.
• We treat the dispatching of an idle vehicle to a zone as a virtual order, so that idle vehicle dispatching is also handled as (virtual) order matching.
• We design a VCG-based pricing method to prevent strategic behavior by passengers and to ensure positive profits for the platform.
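To make the idea behind VCG-based pricing concrete, the following is a minimal sketch of textbook VCG payments on a tiny passenger-to-vehicle matching. It is not the pricing method of this paper: the value matrix, the brute-force search, and the function names `best_matching` and `vcg_payments` are all assumptions made for illustration, and the sketch does not model vehicle value functions, virtual orders, or platform profit. Each passenger pays the welfare loss that their presence imposes on the other passengers, which is what removes the incentive to misreport a value.

```python
from itertools import product


def best_matching(values, passengers):
    """Brute-force the welfare-maximizing matching for the given passengers.

    values[p][v] is the (hypothetical) reported value of passenger p for
    being served by vehicle v. A passenger may stay unmatched (None).
    Returns (total_welfare, {passenger: vehicle_or_None}).
    """
    n_vehicles = len(values[0]) if values else 0
    options = [None] + list(range(n_vehicles))
    best_welfare, best_assignment = 0.0, {p: None for p in passengers}
    for choice in product(options, repeat=len(passengers)):
        used = [v for v in choice if v is not None]
        if len(used) != len(set(used)):
            continue  # a vehicle can serve at most one passenger
        welfare = sum(values[p][v] for p, v in zip(passengers, choice)
                      if v is not None)
        if welfare > best_welfare:
            best_welfare = welfare
            best_assignment = dict(zip(passengers, choice))
    return best_welfare, best_assignment


def vcg_payments(values):
    """Return (assignment, payments) under the textbook VCG rule."""
    everyone = list(range(len(values)))
    welfare, assignment = best_matching(values, everyone)
    payments = {}
    for p in everyone:
        others = [q for q in everyone if q != p]
        # Welfare the others could reach if p were absent.
        welfare_without_p, _ = best_matching(values, others)
        # Welfare the others actually get in the chosen matching.
        own = values[p][assignment[p]] if assignment[p] is not None else 0.0
        payments[p] = welfare_without_p - (welfare - own)
    return assignment, payments


# Illustrative values only: 3 passengers, 2 vehicles.
values = [[10, 4], [7, 2], [3, 1]]
assignment, payments = vcg_payments(values)
```

In this toy instance passenger 0 is matched to vehicle 0, passenger 1 to vehicle 1, and passenger 2 stays unmatched and pays nothing. Every matched passenger pays no more than their reported value, so participating is never worse than staying out.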