Abstract: Thanks to the rapid development of mobile sensing techniques, massive amounts of human-generated spatial-temporal data (HSTD) are collected in urban areas, e.g., passenger-seeking trajectories of taxi drivers and public transit trips of urban dwellers. These HSTD record sequential decisions made by human agents. Studying human behavior from HSTD is beneficial in many ways; for example, learning the passenger-seeking strategies of experienced taxi drivers can help improve the operational efficiency of new drivers. One common method for analyzing human behavior from HSTD is Imitation Learning (IL). Existing IL approaches rely on data collected from experts. However, the human agents who generate HSTD may have different levels of expertise across geographical regions, i.e., good policies in some regions and poor policies in regions where they are less experienced. How to infer the optimal policy for agents in unfamiliar or less-experienced regions remains an open problem. In this paper, we propose a novel framework, Generative Adversarial Imitation Learning for Non-experts (NEXT-GAIL), which first disentangles expert knowledge that is independent of spatial-temporal regions from the demonstration data. This knowledge can then be transferred to regions where the agent does not possess an expert policy. We use real-world taxi trajectory data as an example to evaluate the proposed framework. The comparison results show that NEXT-GAIL outperforms existing state-of-the-art approaches in the accuracy of the inferred optimal policy for non-experts.