Classification of multivariate time series via temporal abstraction and time intervals mining
Knowledge and Information Systems
Classification of multivariate time series data, which often include both time points and time intervals at variable frequencies, is a challenging task. We introduce the KLS framework for the classification of multivariate time series, which implements three phases: (1) application of a temporal-abstraction process that transforms a series of raw time-stamped data points into a series of symbolic time intervals; (2) mining these intervals to discover frequent temporal patterns, using Allen's 13 temporal relations; and (3) using the patterns as features to induce a classifier. Researchers can use different sets of temporal relations and can vary an epsilon factor that represents flexibility in the nature of the temporal relations. We evaluated the framework on datasets in the domains of diabetes, intensive care, and infectious hepatitis, assessing the effects of the various settings of the KLS framework. Discretization using SAX led to better performance than Equal-Width Discretization (EWD); knowledge-based abstraction, when available, was superior to both. Using three abstract temporal relations was superior to using the seven core temporal relations. Using an epsilon larger than zero tended to yield slightly better accuracy with SAX but reduced accuracy with EWD, and so does not seem warranted overall. No feature selection method we tried proved useful. Regarding feature (pattern) representation, Mean Duration performed better than Horizontal Support (within the same entity), which in turn performed better than the Binary (existence) representation method. Random Forest outperformed the other induction methods we tried.
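To make the three phases concrete, the following is a minimal Python sketch, not the framework's actual implementation. It assumes equal-width discretization for the abstraction step, a hypothetical `max_gap` parameter for joining neighbouring points, an illustrative grouping of Allen's relations into three coarse classes (before, overlaps, contains) with epsilon as a tolerance on endpoint comparisons, and a simple reading of the Binary, Horizontal Support, and Mean Duration representations. All function names are invented for illustration.

```python
from typing import Dict, List, Tuple

Interval = Tuple[str, float, float]  # (symbol, start_time, end_time)


def equal_width_cutoffs(values: List[float], n_bins: int) -> List[float]:
    """Interior cut-offs that split [min, max] into n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + width * i for i in range(1, n_bins)]


def abstract_series(points: List[Tuple[float, float]], cutoffs: List[float],
                    name: str, max_gap: float) -> List[Interval]:
    """Phase 1 (sketch): merge consecutive same-state points into intervals.

    points must be sorted by timestamp. max_gap is a hypothetical parameter:
    points further apart than this are not joined into one interval.
    """
    intervals: List[Interval] = []
    for t, v in points:
        state = sum(v >= c for c in cutoffs)  # bin index, 0 = lowest
        symbol = f"{name}.{state}"
        if intervals and intervals[-1][0] == symbol and t - intervals[-1][2] <= max_gap:
            intervals[-1] = (symbol, intervals[-1][1], t)  # extend last interval
        else:
            intervals.append((symbol, t, t))  # open a new interval
    return intervals


def coarse_relation(a: Interval, b: Interval, eps: float = 0.0) -> str:
    """Phase 2 building block (sketch): coarse relation of a to b.

    Assumes a starts no later than b. The grouping below is an assumption:
    'before' covers before/meets, 'contains' covers b ending within a,
    and everything else is 'overlaps'. eps relaxes the endpoint tests.
    """
    _, _, a_end = a
    _, b_start, b_end = b
    if a_end <= b_start + eps:
        return "before"
    if b_end <= a_end + eps:
        return "contains"
    return "overlaps"


def pattern_features(instances: List[Tuple[float, float]]) -> Dict[str, float]:
    """Phase 3 (sketch): represent one pattern within one entity.

    instances holds the (start, end) span of each occurrence of the pattern.
    """
    durations = [end - start for start, end in instances]
    return {
        "binary": 1 if instances else 0,           # does the pattern exist?
        "horizontal_support": len(instances),      # how many occurrences?
        "mean_duration": sum(durations) / len(durations) if durations else 0.0,
    }


if __name__ == "__main__":
    glucose = [(0, 90), (1, 95), (2, 150), (3, 160), (4, 100)]  # made-up values
    cutoffs = equal_width_cutoffs([v for _, v in glucose], n_bins=2)
    print(abstract_series(glucose, cutoffs, "glucose", max_gap=1))
    print(coarse_relation(("A", 0, 5), ("B", 4, 8), eps=0.0))
    print(coarse_relation(("A", 0, 5), ("B", 4, 8), eps=1.0))
```

In this sketch, raising epsilon from 0 to 1 turns a one-unit overlap into a "before" relation, which illustrates how the epsilon setting can change which patterns are discovered. The resulting per-pattern feature values would then be passed to any standard classifier.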