Similar sequence matching supporting variable-length and variable-tolerance continuous queries on time-series data stream

Hyo-Sang Lim, Kyu-Young Whang, Yang-Sae Moon
<span title="">2008</span> <i title="Elsevier BV"> <a target="_blank" rel="noopener" href="" style="color: black;">Information Sciences</a> </i> &nbsp;
We propose a new similar sequence matching method that efficiently supports variable-length and variable-tolerance continuous query sequences on a time-series data stream. Earlier methods do not adequately support variable lengths or variable tolerances for continuous query sequences when too many query sequences are registered to handle in main memory. To support variable-length query sequences, we use a window construction mechanism that divides long sequences into smaller windows for indexing and searching the sequences. To support variable-tolerance query sequences, we present a new notion of intervaled sequences, whose individual entries are intervals of real numbers rather than single real numbers. We also propose a new similar sequence matching method based on these notions and formally prove its correctness. In addition, we show that our method has the prematching characteristic: it finds future candidates of similar sequences in advance. Experimental results show that our method outperforms the naive method by 2.6-102.1 times and the existing methods in the literature by 1.4-9.8 times over the entire ranges of parameters tested when the query selectivities are low (<32%), a range that is practically useful in large database applications.
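The two notions named in the abstract, window construction and intervaled sequences, can be illustrated with a minimal Python sketch. This is not the authors' construction, which the abstract does not spell out. It assumes disjoint fixed-length windows, a Euclidean distance, and an interval of the form [q - eps, q + eps] around each query entry; the function names `split_into_windows`, `to_intervaled`, and `may_match` are hypothetical. Under these assumptions the interval test is only a candidate filter: any data window within Euclidean distance eps of the query window must pass it, but passing it does not guarantee a match.

```python
import math


def split_into_windows(seq, w):
    """Split seq into consecutive disjoint windows of length w.

    Assumption: a trailing remainder shorter than w is dropped.
    """
    return [seq[i:i + w] for i in range(0, len(seq) - w + 1, w)]


def to_intervaled(seq, eps):
    """Turn a real-valued sequence into a sequence of intervals.

    Assumption: each entry q becomes the closed interval (q - eps, q + eps);
    the paper's own definition may differ.
    """
    return [(x - eps, x + eps) for x in seq]


def may_match(intervaled, data):
    """Return True if every data entry lies inside the matching interval.

    This is a necessary condition for Euclidean distance <= eps,
    so it can only be used to select candidates.
    """
    if len(intervaled) != len(data):
        return False
    return all(lo <= x <= hi for (lo, hi), x in zip(intervaled, data))


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


if __name__ == "__main__":
    query = [1.0, 2.0, 3.0, 4.0]
    stream = [1.1, 2.1, 3.1, 4.1]
    eps = 0.5
    # Compare the query and the stream window by window.
    for q_win, s_win in zip(split_into_windows(query, 2),
                            split_into_windows(stream, 2)):
        candidate = may_match(to_intervaled(q_win, eps), s_win)
        print(q_win, s_win, candidate, euclidean(q_win, s_win) <= eps)
```

A candidate that passes `may_match` would still need an exact distance check, as the last column of the printed output does with `euclidean`.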
doi:10.1016/j.ins.2007.10.026 fatcat:w7tjk5kv7bbqhei4nwfvt2jwdm
