Enhancing the Understanding of Train Delays With Delay Evolution Pattern Discovery: A Clustering and Bayesian Network Approach

Abstract: Train delay evolutions exhibit different patterns (i.e., increasing, decreasing, or unchanged delays) because of the effects of stochastic disturbances and pre-scheduled supplement/recovery times. The dynamics and uncertainty of train delay evolution make train delay prediction a challenging task. To address this problem, this study presents a hybrid framework, called the context-driven Bayesian network (CDBN), composed of a delay evolution pattern discovery model, i.e., a K-Means clustering approach, and a train delay prediction model, i.e., a Bayesian network (BN). The clustering algorithm is used to uncover the delay evolution patterns and to classify the data into different categories based on the delay jumps, i.e., the change in delay from one station to the subsequent station. The BN model, which considers the delays at previous stations to relax the Markov property assumption, is used as the predictive model of train delays. The data in each category (as classified by the clustering model) are used to train and test the BN model separately. We evaluated the BN model, the clustering algorithm, and the CDBN model by comparing each against its counterparts. The results show that: (1) the proposed BN structure has advantages over common delay prediction models built on the Markov property; (2) the clustering is effective and can substantially improve the accuracy of the predictive model; and (3) the CDBN outperforms existing delay prediction models in terms of broad usability, because it captures the delay evolution patterns more fully.
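The pattern discovery step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the delay records, the choice of three clusters, the initial centers, and the helper names `delay_jumps` and `kmeans_1d` are all assumptions made for illustration, and the BN that would be trained on each subset is not shown.

```python
from statistics import mean

# Hypothetical records: (delay at station s, delay at station s+1), in minutes.
records = [(5, 3), (4, 4), (2, 6), (10, 7), (3, 3), (1, 5), (6, 6), (8, 5), (0, 4)]


def delay_jumps(pairs):
    """Delay jump = change in delay from one station to the subsequent station."""
    return [after - before for before, after in pairs]


def nearest(value, centers):
    """Index of the center closest to value."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))


def kmeans_1d(values, centers, max_iters=50):
    """Plain Lloyd's algorithm on scalar values with given initial centers."""
    centers = list(centers)
    for _ in range(max_iters):
        groups = [[] for _ in centers]
        for v in values:
            groups[nearest(v, centers)].append(v)
        # Keep the old center if a cluster happens to be empty.
        updated = [mean(g) if g else c for g, c in zip(groups, centers)]
        if updated == centers:
            break
        centers = updated
    labels = [nearest(v, centers) for v in values]
    return centers, labels


jumps = delay_jumps(records)

# Assumed initialisation: one center each for decreasing, unchanged,
# and increasing delays.
centers, labels = kmeans_1d(jumps, [min(jumps), 0, max(jumps)])

# Split the records by cluster; a separate predictive model would then be
# trained and tested on each subset.
subsets = {k: [r for r, lab in zip(records, labels) if lab == k]
           for k in range(len(centers))}
```

With this initialisation, cluster 0 collects the decreasing delays, cluster 1 the unchanged delays, and cluster 2 the increasing delays, so each entry of `subsets` corresponds to one delay evolution pattern.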