A two-dimensional clustering method for high-speed railway trains in China based on train characteristics and operational performance
At present, China has the world's longest high-speed railway (HSR) network, but most trains lack a clear market positioning and hierarchical standard. Current train hierarchies and their adjustments are set empirically by HSR organisers. This paper applies a clustering method and cross-over analysis to classify China's HSR trains and to provide evidence-based operating suggestions. Using timetable data and ticket booking data from the Shanghai-Nanjing intercity railway, we establish a clustering index system consisting of train characteristics indexes and operational performance indexes, then use t-distributed Stochastic Neighbour Embedding (t-SNE) to reduce the dimensionality of the original data. We determine the optimal number of clusters using validity indexes and cluster the HSR trains with k-means. After clustering, we use cross-over analysis to illustrate the relationship between train characteristics, passenger demand and operational performance. We find two main types of train on the Shanghai-Nanjing intercity HSR: trains departing on the hour, which have good operational performance, and low-capacity trains with staggered stops, run under a "low capacity, high density" strategy, which have better operational performance at peak times. After 19:00, as passenger demand decreases, train capacity can be reduced to avoid unnecessary waste. Short-distance trains could easily be replaced by long-distance trains with similar stop schedules and need to maintain a certain operating frequency. The proposed clustering method is applicable to Chinese HSR lines in general.
INDEX TERMS High-speed railway train, clustering analysis, train characteristics, operational performance, cross-over analysis.
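The step of choosing the number of clusters with a validity index and then clustering with k-means can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the silhouette coefficient stands in for the unspecified validity indexes, the input `demo_points` is synthetic 2-D data standing in for a t-SNE embedding of the train indexes (the t-SNE step itself is not reproduced here), and a deterministic farthest-point initialisation is used for simplicity. All function names are hypothetical.

```python
import math

def farthest_point_init(points, k):
    """Deterministic seeding (a simplification): start at the first point,
    then repeatedly add the point farthest from its nearest chosen centre."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, max_iter=100):
    """Lloyd's k-means on a list of equal-length tuples; returns one label per point."""
    centers = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centre.
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centres are too
        labels = new_labels
        # Update step: move each centre to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient (assumed validity index); higher is better."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes a score of 0
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def choose_k(points, k_range):
    """Score every candidate k and return the best k with all scores."""
    scores = {k: silhouette(points, kmeans(points, k)) for k in k_range}
    return max(scores, key=scores.get), scores

# Synthetic stand-in for a 2-D embedding: three well-separated groups.
offsets = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1)]
demo_points = [(cx + dx, cy + dy)
               for cx, cy in [(0, 0), (20, 0), (0, 20)]
               for dx, dy in offsets]

best_k, scores = choose_k(demo_points, range(2, 6))
print(best_k, scores)
```

With real data, `demo_points` would be replaced by the low-dimensional coordinates produced by a t-SNE implementation, and other validity indexes could be swapped in for `silhouette` without changing the rest of the sketch.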