An OD Flow Clustering Method Based on Vector Constraints: A Case Study for Beijing Taxi Origin-Destination Data

Xiaogang Guo, Zhijie Xu, Jianqin Zhang, Jian Lu, Hao Zhang
2020 ISPRS International Journal of Geo-Information  
Origin-destination (OD) flow pattern mining is an important research method in the study of urban dynamics; within it, OD flow clustering analysis discovers the activity patterns of urban residents and mines the coupling relationships among urban subspaces and their dynamic causes. Existing flow clustering methods are limited by the spatial constraints of OD points, rely on the spatial similarity of geographical points, and lack in-depth analysis of high-dimensional flow characteristics, so they have difficulty finding irregular flow clusters. In this paper, we propose an OD flow clustering method based on vector constraints (ODFCVC), which defines the OD flow event point and the OD flow vector to express the spatial location relationship and the geometric flow behavior characteristics of an OD flow. First, the OD flow vector coordinate system is normalized by spatially clustering OD flow event points using Euclidean distance; then, OD flow clusters with similar flow patterns are mined by clustering OD flow vector features using adjusted cosine similarity. This two-step clustering constrains the vector coordinate system and the vector similarity, and thereby transforms the OD data from point set space to vector space, which simplifies the calculation of high-dimensional OD flow similarity and helps mine representative OD flow clusters in flow space. Given the properties of OD flow clusters, the k-means algorithm is selected as the basic clustering logic in both steps, and a sum of squared error perceptually important points algorithm that considers silhouette coefficients (SSEPIP) is adopted to extract the optimal number of clusters automatically, without user-defined parameters. Tested on origin-destination flow data from Beijing, China, the ODFCVC method yields new traffic flow communities based on traffic hubs and finds irregular traffic flow clusters (including cluster mode, divergence mode, and convergence mode) with representative travel trends.
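The two ingredients named in the abstract, Euclidean k-means on event points and adjusted cosine similarity between flow vectors, can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not define the OD flow event point, so the midpoint of each trip is used here purely as an assumption; the function names (`flow_vector`, `event_point`, `kmeans`, `adjusted_cosine`), the toy coordinates, and the choice of reference mean passed to `adjusted_cosine` are likewise hypothetical. The number of clusters is fixed by hand rather than chosen by SSEPIP.

```python
import math

def flow_vector(origin, destination):
    """Displacement from origin to destination as (dx, dy)."""
    return (destination[0] - origin[0], destination[1] - origin[1])

def event_point(origin, destination):
    """ASSUMPTION: the trip midpoint stands in for the paper's event point."""
    return ((origin[0] + destination[0]) / 2, (origin[1] + destination[1]) / 2)

def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means with Euclidean distance, seeded with the first k points."""
    centers = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Recompute each center as the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

def adjusted_cosine(u, v, mean):
    """Cosine similarity of u and v after subtracting a shared reference mean."""
    du = [a - m for a, m in zip(u, mean)]
    dv = [b - m for b, m in zip(v, mean)]
    norm_u = math.sqrt(sum(a * a for a in du))
    norm_v = math.sqrt(sum(b * b for b in dv))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # similarity is undefined for a zero vector; report 0
    return sum(a * b for a, b in zip(du, dv)) / (norm_u * norm_v)

# Toy OD pairs (origin, destination) in arbitrary planar units.
trips = [
    ((0.0, 0.0), (4.0, 0.0)),      # eastbound, near the origin
    ((0.0, 1.0), (4.0, 1.0)),      # eastbound, near the origin
    ((10.0, 10.0), (10.0, 14.0)),  # northbound, far away
    ((11.0, 10.0), (11.0, 14.0)),  # northbound, far away
]

# Step 1: spatial clustering of event points.
points = [event_point(o, d) for o, d in trips]
labels, centers = kmeans(points, k=2)

# Step 2: compare flow vectors by adjusted cosine similarity.
vectors = [flow_vector(o, d) for o, d in trips]
same_direction = adjusted_cosine(vectors[0], vectors[1], mean=(0.0, 0.0))
perpendicular = adjusted_cosine(vectors[0], vectors[2], mean=(0.0, 0.0))
```

In this toy data the two eastbound trips share one spatial cluster and the two northbound trips share the other. With a zero reference mean, `adjusted_cosine` reduces to ordinary cosine similarity; a non-zero mean shifts the vectors before the angle between them is measured.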
doi:10.3390/ijgi9020128 fatcat:jhoo32ndxzewtnfo6dc3lt54om