Finding K Most Significant Motifs in Big Time Series Data

Zaher Al Aghbari, Ayoub Al-Hamadi
2020 Procedia Computer Science  
An efficient algorithm for discovering frequently occurring patterns, called motifs, in a time series would be useful as a tool for summarizing and visualizing big time series databases. In this paper, we propose an efficient approximate algorithm, called DiscMotifs, to discover the K most significant (KMS) motifs in a time series. First, the proposed algorithm transforms the time series into a SAX representation and divides that representation into subsequences. Next, these subsequences are linearized by projecting them into a one-dimensional space based on their distances from a randomly selected reference point (itself a subsequence). By exploiting this linear ordering of subsequences, DiscMotifs efficiently discovers the KMS motifs. The DiscMotifs algorithm requires storage space linear in the number of subsequences. We demonstrate the feasibility of this approach on several synthetic and real application datasets.
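The pipeline outlined in the abstract (SAX transform, subsequence extraction, projection onto one dimension by distance to a random reference, motif search along that ordering) can be illustrated with a minimal sketch. This is not the authors' DiscMotifs implementation. The function names, the 4-symbol alphabet, the plain Euclidean distance on symbol indices, the match radius, and the greedy top-K selection are all assumptions made for illustration. The pruning step relies only on the triangle inequality: two subsequences within `radius` of each other must have projected distances that differ by at most `radius`.

```python
import bisect
import math
import random

# Standard SAX cut points for a 4-symbol alphabet (alphabet size is an assumption).
BREAKPOINTS = [-0.67, 0.0, 0.67]


def to_sax(series, segment_len):
    """Z-normalize, average each block of segment_len points (PAA), map to symbols 0..3."""
    mean = sum(series) / len(series)
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / len(series)) or 1.0
    z = [(x - mean) / std for x in series]
    symbols = []
    for i in range(0, len(z) - segment_len + 1, segment_len):
        avg = sum(z[i:i + segment_len]) / segment_len
        symbols.append(bisect.bisect_left(BREAKPOINTS, avg))
    return symbols


def dist(a, b):
    """Euclidean distance on symbol indices (a simplification, not SAX MINDIST)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_most_significant(series, segment_len, word_len, radius, k, seed=0):
    """Return up to k (start_index_in_sax, match_count) pairs, most frequent first."""
    sax = to_sax(series, segment_len)
    subs = [tuple(sax[i:i + word_len]) for i in range(len(sax) - word_len + 1)]

    # Linearize: sort subsequence indices by distance to a random reference subsequence.
    ref = random.Random(seed).choice(subs)
    order = sorted(range(len(subs)), key=lambda i: dist(subs[i], ref))
    keys = [dist(subs[i], ref) for i in order]

    eps = 1e-9  # tolerance for floating-point comparisons on the projected keys
    counts = {}
    for pos, i in enumerate(order):
        # Only subsequences whose projected key lies within radius can match.
        lo = bisect.bisect_left(keys, keys[pos] - radius - eps)
        hi = bisect.bisect_right(keys, keys[pos] + radius + eps)
        counts[i] = sum(
            1
            for j in order[lo:hi]
            # Skip trivial (overlapping) matches, then verify the true distance.
            if abs(i - j) >= word_len and dist(subs[i], subs[j]) <= radius
        )

    # Greedy top-K: skip candidates that overlap or duplicate an already chosen motif.
    ranked = sorted(counts, key=lambda i: (-counts[i], i))
    chosen = []
    for i in ranked:
        if all(abs(i - c) >= word_len and dist(subs[i], subs[c]) > radius
               for c, _ in chosen):
            chosen.append((i, counts[i]))
        if len(chosen) == k:
            break
    return chosen
```

A call such as `k_most_significant(series, segment_len=1, word_len=4, radius=0, k=2)` returns the SAX start positions of the two most frequent non-overlapping patterns together with their match counts. The sketch keeps only the subsequence list, the sorted order, and one count per subsequence, so its storage grows linearly with the number of subsequences; its running time can still degrade when many subsequences share the same projected distance.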
doi:10.1016/j.procs.2020.03.131 fatcat:ed6alkdsorck7nsgqkihwrebxa