CT-Mapper: Mapping sparse multimodal cellular trajectories using a multilayer transportation network
Mobile phone data have recently become an attractive source of information about mobility behavior. Since cell phone data can be captured in a passive way for a large user population, they can be harnessed to collect well-sampled mobility information. In this paper, we propose CT-Mapper, an unsupervised algorithm that enables the mapping of mobile phone traces over a multimodal transport network. One of the main strengths of CT-Mapper is its capability to map noisy sparse cellular multimodal
trajectories over a multilayer transportation network whose layers have different physical properties, rather than only mapping trajectories associated with a single layer. Such a network is modeled by a large multilayer graph in which the nodes correspond to metro/train stations or road intersections and the edges correspond to connections between them. The mapping problem is modeled by an unsupervised HMM in which the observations correspond to sparse user mobile trajectories and the hidden states to the multilayer graph nodes. The HMM is unsupervised because the transition and emission probabilities are inferred from, respectively, the physical transportation properties and the spatial coverage of antenna base stations. To evaluate CT-Mapper, we collected cellular traces with their corresponding GPS trajectories for a group of volunteer users in Paris and its vicinity (France). We show that CT-Mapper is able to accurately retrieve the real cell phone user paths despite the sparsity of the observed trajectories. Furthermore, our transition probability model is up to 20% more accurate than other naive models.
(M.A. El-Yacoubi).
... studies used GPS to accurately sense spatial data with a localization error bound of ≤50 m. Although it ensures the collection of fine-grained mobility trajectories (as shown in Fig. 1b), GPS-based data collection has two main drawbacks: first, it causes high energy consumption, and second, it is constrained to a limited group of users (e.g., taxi drivers or a group of car drivers). GPS sensing is therefore not suitable for collecting large-scale data from metropolitan area populations. By contrast, cellular data provided by network operators do not suffer from these issues and have, as a result, recently become a new source of mobility information. Signaling information from mobile network operators (CDRs, Call Data Records) has been used as a valuable source of mobility information for large-scale populations [3, 6, 7].
Localizing mobile phone users by antenna (i.e., cellular tower), nonetheless, provides only coarse-grained mobility trajectories at the antenna level, with a localization error that ranges from a hundred meters in densely populated cities to several kilometers in rural areas. Given the resulting cellular mobility trajectory (i.e., a sequence of antenna IDs) and the location of each antenna, as shown in Fig. 1c, it can be difficult to identify the road or metro station that the user passes by (as shown in Fig. 1a).
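To make the HMM formulation summarized in the abstract more concrete, the following minimal sketch decodes a sequence of antenna IDs into the most likely sequence of nodes of a small multilayer graph using the Viterbi algorithm. It is an illustration under stated assumptions, not the CT-Mapper implementation: the toy graph, the antenna coordinates, the Gaussian distance-based emission score, and the uniform neighbor transition with a fixed stay probability are all hypothetical stand-ins for the antenna coverage and transportation-property models used in the paper.

```python
import math

# Hypothetical toy multilayer graph: node -> (x_km, y_km, layer).
NODES = {
    "road_A": (0.0, 0.0, "road"),
    "road_B": (1.0, 0.0, "road"),
    "metro_C": (1.2, 0.5, "metro"),
    "metro_D": (3.0, 0.5, "metro"),
}
# Undirected edges; ("road_B", "metro_C") is a cross-layer transfer edge.
EDGES = {("road_A", "road_B"), ("road_B", "metro_C"), ("metro_C", "metro_D")}
# Hypothetical antenna positions (x_km, y_km).
ANTENNAS = {"a1": (0.0, 0.1), "a2": (1.0, 0.1), "a3": (1.3, 0.5), "a4": (3.0, 0.4)}


def neighbors(node):
    """Return the set of nodes sharing an edge with `node`."""
    return {b for a, b in EDGES if a == node} | {a for a, b in EDGES if b == node}


def log_emission(node, antenna_id, sigma_km=0.5):
    """Assumed emission score: unnormalized Gaussian in node-antenna distance."""
    x, y, _ = NODES[node]
    ax, ay = ANTENNAS[antenna_id]
    return -((x - ax) ** 2 + (y - ay) ** 2) / (2 * sigma_km ** 2)


def log_transition(src, dst, stay_prob=0.5):
    """Assumed transition: stay with stay_prob, else move uniformly to a neighbor."""
    if src == dst:
        return math.log(stay_prob)
    nbrs = neighbors(src)
    if dst in nbrs:
        return math.log((1 - stay_prob) / len(nbrs))
    return -math.inf  # non-adjacent nodes are unreachable in one step


def viterbi(observations):
    """Return the most likely node sequence for a list of antenna IDs."""
    states = list(NODES)
    # Uniform prior over start nodes, so only the first emission matters.
    score = {s: log_emission(s, observations[0]) for s in states}
    backpointers = []
    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for dst in states:
            best_src = max(states, key=lambda src: score[src] + log_transition(src, dst))
            new_score[dst] = (
                score[best_src] + log_transition(best_src, dst) + log_emission(dst, obs)
            )
            pointers[dst] = best_src
        score = new_score
        backpointers.append(pointers)
    # Backtrack from the best final state.
    path = [max(states, key=score.get)]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


if __name__ == "__main__":
    print(viterbi(["a1", "a2", "a3", "a4"]))
```

With this toy input, the decoded path is `['road_A', 'road_B', 'metro_C', 'metro_D']`, which crosses from the road layer to the metro layer through the transfer edge. Handling truly sparse traces, where consecutive observations may be several graph hops apart, would require a transition model that accounts for multi-hop paths; the one-hop transition here is a simplification.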