PU-Shapelets: Towards Pattern-Based Positive Unlabeled Classification of Time Series [chapter]

Shen Liang, Yanchun Zhang, Jiangang Ma
2019 Complexity in Polish Phonotactics  
Real-world time series classification applications often involve positive unlabeled (PU) training data, where there is only a small set PL of positive labeled examples and a large set U of unlabeled ones. Most existing time series PU classification methods utilize all readings in the time series, making them sensitive to non-characteristic readings. Characteristic patterns named shapelets present a promising solution to this problem, yet discovering shapelets under PU settings is not easy.
In this paper, we take on the challenging task of shapelet discovery with PU data. We propose a novel pattern ensemble technique utilizing both characteristic and non-characteristic patterns to rank U examples by their possibility of being positive. We also present a novel stopping criterion to estimate the number of positive examples in U. These enable us to effectively label all U training examples and conduct supervised shapelet discovery. The shapelets are then applied to online classification. Extensive experiments demonstrate the effectiveness of our method.
This work is funded by NSFC grants 61672161 and 61332013. We sincerely thank Dr Nurjahan Begum and Dr Anthony Bagnall for granting us access to the code of [3] and [7], all data contributors of [5], and all our colleagues who have contributed their valuable suggestions to this work.
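To make the idea of ranking U examples by a pattern more concrete, the following is a minimal sketch, not the method of the paper. It assumes a single candidate pattern (for example, a subsequence cut from an example in PL) and assumes the common shapelet convention of a z-normalized Euclidean distance minimized over all sliding windows; the abstract does not state which distance is used. The function names `znorm`, `subsequence_distance`, and `rank_unlabeled` are hypothetical, and the pattern ensemble and stopping criterion described above are not reproduced here.

```python
import math


def znorm(seq):
    """Z-normalize a sequence; a constant sequence maps to all zeros."""
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in seq]


def subsequence_distance(pattern, series):
    """Smallest Euclidean distance between the z-normalized pattern and
    any z-normalized window of the same length in the series.
    (Assumed distance; chosen for illustration only.)"""
    m = len(pattern)
    if m > len(series):
        raise ValueError("pattern must not be longer than the series")
    p = znorm(pattern)
    best = math.inf
    for start in range(len(series) - m + 1):
        w = znorm(series[start:start + m])
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(p, w)))
        best = min(best, d)
    return best


def rank_unlabeled(pattern, unlabeled):
    """Return indices of the unlabeled series, closest to the pattern first."""
    dists = [subsequence_distance(pattern, s) for s in unlabeled]
    return sorted(range(len(unlabeled)), key=lambda i: dists[i])


# Toy data (made up for illustration): a peak-shaped pattern and three series.
pattern = [0, 1, 3, 1, 0]
unlabeled = [
    [5, 4, 3, 2, 1, 0, -1, -2],   # steadily decreasing, no peak
    [0, 0, 0, 1, 3, 1, 0, 0],     # contains the pattern exactly
    [0, 0, 1, 2, 1, 0, 0, 0],     # contains a similar but flatter peak
]
ranking = rank_unlabeled(pattern, unlabeled)
```

In this toy setting the series that contains the pattern is ranked first and the series without any peak is ranked last. Because each window is z-normalized, a scaled or shifted copy of the pattern would also match closely, which is one reason this normalization is a frequent choice for shapelet matching.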
doi:10.1007/978-3-030-18576-3_6 dblp:conf/dasfaa/LiangZM19 fatcat:ctzvxlfrlfa5jf7bmkzm3skmdu