Semi-Supervised Deep Learning Approach for Transportation Mode Identification Using GPS Trajectory Data
IEEE Transactions on Knowledge and Data Engineering
Identification of travelers' transportation modes is a fundamental step for various problems that arise in the domain of transportation such as travel demand analysis, transport planning, and traffic management. This thesis aims to identify travelers' transportation modes purely based on their GPS trajectories. First, a segmentation process is developed to partition a user's trip into GPS segments with only one transportation mode. A majority of studies have proposed mode inference models based
... on hand-crafted features, which might be vulnerable to traffic and environmental conditions. Furthermore, the classification task in almost all models have been performed in a supervised fashion while a large amount of unlabeled GPS trajectories has remained unused. Accordingly, a deep SEmi-Supervised Convolutional Autoencoder (SECA) architecture is proposed to not only automatically extract relevant features from GPS segments but also exploit useful information in unlabeled data. The SECA integrates a convolutional-deconvolutional autoencoder and a convolutional neural network into a unified framework to concurrently perform supervised and unsupervised learning. The two components are simultaneously trained using both labeled and unlabeled GPS segments, which have already been converted into an efficient representation for the convolutional operation. An optimum schedule for varying the balancing parameters between reconstruction and classification errors are also implemented. The performance of the proposed SECA model, trip segmentation, the method for converting a raw trajectory into a new representation, the hyperparameter schedule, and the model configuration are evaluated by comparing to several baselines and alternatives for various amounts of labeled and unlabeled data. The experimental results demonstrate the superiority of the proposed model over the state-of-the-art semi-supervised and supervised methods with respect to metrics such as accuracy and F-measure. (GENERAL AUDIENCE ABSTRACT) Identifying users' transportation modes (e.g., bike, bus, train, and car) is a key step towards many transportation related problems including (but not limited to) transport planning, transit demand analysis, auto ownership, and transportation emissions analysis. Traditionally, the information for analyzing travelers' behavior for choosing transport mode(s) was obtained through travel surveys. High cost, low-response rate, time-consuming manual data collection, and misreporting are the main demerits of the survey-based approaches. With the rapid growth of ubiquitous GPS-enabled devices (e.g., smartphones), a constant stream of users' trajectory data can be recorded. A user's GPS trajectory is a sequence of GPS points, recorded by means of a GPS-enabled device, in which a GPS point contains the information of the device geographic location at a particular moment. In this research, users' GPS trajectories, rather than traditional resources, are harnessed to predict their transportation mode by means of statistical models. With respect to the statistical models, a wide range of studies have developed travel mode detection models using on hand-designed attributes and classical learning techniques. Nonetheless, hand-crafted features cause some main shortcomings including vulnerability to traffic uncertainties and biased engineering justification in generating effective features. A potential solution to address these issues is by leveraging deep learning frameworks that are capable of capturing abstract features from the raw input in an automated fashion. Thus, in this thesis, deep learning architectures are exploited in order to identify transport modes based on only raw GPS tracks. It is worth noting that a significant portion of trajectories in GPS data might not be annotated by a transport mode and the acquisition of labeled data is a more expensive and labor-intensive task in comparison with collecting unlabeled data. Thus, utilizing the unlabeled GPS trajectory (i.e., the GPS trajectories that have not been annotated by a transport mode) is a cost-effective approach for improving the prediction quality of the travel mode detection model. Therefore, the unlabeled GPS data are also leveraged by developing a novel deep-learning architecture that is capable of extracting information from both labeled and unlabeled data. The experimental results demonstrate the superiority of the proposed models over the state-of-the-art methods in literature with respect to several performance metrics.