Abstract: Shortest path distances between node pairs on road networks are essential for many applications. Traditional methods, such as breadth-first search (BFS) and Dijkstra's algorithm, compute exact results, but their high time cost makes them difficult to apply to large-scale road networks. Other methods precompute and store the shortest path distances of all node pairs and then answer distance queries by simple lookups, but they incur a high space cost. Some applications, such as finding the nearest point of interest (POI) for travel recommendations, only need approximate distances. It is therefore important to find a method that estimates the shortest path distance quickly and with low space cost. In this paper, we use CatBoost, a machine learning method based on gradient boosting decision trees, to estimate the shortest path distance. We first obtain node features based on landmarks and combine them with the longitude and latitude of each node and the Euclidean distance between the node pair to form the model input. We then feed these features into a CatBoost model for training. The space complexity is O(k|V|) and the query time complexity is O(1). We conduct experiments on real road networks in Xiamen and New York City. The experiments demonstrate that our model estimates the shortest path distance with low error and with low time and space cost.
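The feature construction described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`dijkstra`, `build_landmark_table`, `pair_features`), the toy graph, the planar coordinates standing in for longitude and latitude, and the exact ordering of the feature vector are all assumptions made for illustration. It also assumes that k in O(k|V|) denotes the number of landmarks, so that the stored table holds one distance per landmark per node.

```python
import heapq
import math


def dijkstra(adj, source):
    """Return exact shortest path distances from source to each reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry, a shorter path was already found
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def build_landmark_table(adj, landmarks):
    """Store, for every node, its distance to each landmark (k values per node)."""
    per_landmark = [dijkstra(adj, lm) for lm in landmarks]
    return {node: [d.get(node, math.inf) for d in per_landmark] for node in adj}


def pair_features(u, v, table, coords):
    """Assumed layout: landmark distances of u and v, both coordinates, Euclidean distance."""
    euclid = math.dist(coords[u], coords[v])
    return table[u] + table[v] + list(coords[u]) + list(coords[v]) + [euclid]


# Hypothetical toy network: four nodes on a unit square, undirected unit-length edges.
ADJ = {
    "A": [("B", 1.0), ("D", 1.0)],
    "B": [("A", 1.0), ("C", 1.0)],
    "C": [("B", 1.0), ("D", 1.0)],
    "D": [("C", 1.0), ("A", 1.0)],
}
COORDS = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (1.0, 1.0), "D": (0.0, 1.0)}

TABLE = build_landmark_table(ADJ, landmarks=["A", "C"])  # k = 2 landmarks
feats = pair_features("A", "C", TABLE, COORDS)
```

In this sketch each query assembles a vector of fixed length 2k + 5 from table lookups, which would then be passed to a trained regressor (CatBoost in the paper) to predict the distance. The precomputation runs one Dijkstra search per landmark, and only the resulting k-by-|V| table needs to be kept at query time.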