Real-time approximate Range Motif discovery & data redundancy removal algorithm

Ankur Narang, Souvik Bhattacherjee
2011 Proceedings of the 14th International Conference on Extending Database Technology - EDBT/ICDT '11  
To the best of our knowledge, this is the highest real-time throughput for approximate Range Motif discovery and data redundancy removal on such massive datasets.  ...  Real-time scalable approximate Range Motif discovery on massive datasets is a challenging problem.  ...  We have presented novel sequential and parallel algorithms for real-time approximate Range Motif discovery and data redundancy removal.  ... 
doi:10.1145/1951365.1951422 dblp:conf/edbt/NarangB11 fatcat:ticx3mddvbgulgtpie4du2xsfq

Salient Segmentation of Medical Time Series Signals

Jonathan Woodbridge, Mars Lan, Majid Sarrafzadeh, Alex Bui
2011 2011 IEEE First International Conference on Healthcare Informatics, Imaging and Systems Biology  
Furthermore, salient segmentation can reduce redundancy in motif discovery algorithms by more than 85%, yielding a more succinct representation of a time series signal.  ...  The resulting database index contains an abundance of redundant time series segments with little to no alignment. This paper presents the idea of "salient segmentation".  ...  Acknowledgments: The authors would like to thank Professor Eamonn Keogh for providing us with their implementation of the motif discovery algorithm presented in [9].  ... 
doi:10.1109/hisb.2011.41 pmid:27617296 pmcid:PMC5015443 dblp:conf/hisb/WoodbridgeLSB11 fatcat:6dbfjagm6jdglcdvhjfmlpfr7i

Discovering Subdimensional Motifs of Different Lengths in Large-Scale Multivariate Time Series [article]

Yifeng Gao, Jessica Lin
2019 arXiv   pre-print
On the other hand, previous work shows that index-based fixed-length approximate motif discovery algorithms such as random projection are not suitable for detecting variable-length motifs due to memory  ...  In this paper, we introduce an approximate variable-length subdimensional motif discovery algorithm called Collaborative HIerarchy based Motif Enumeration (CHIME) to efficiently detect variable-length  ...  Subdimensional motifs of different lengths widely exist in real world time series data.  ... 
arXiv:1911.09218v1 fatcat:mimubzgpafchtf3p5y6hc7lr74

Efficient Discovery of Variable-length Time Series Motifs with Large Length Range in Million Scale Time Series [article]

Yifeng Gao, Jessica Lin
2018 arXiv   pre-print
In this work, we introduce an approximate algorithm called HierarchIcal based Motif Enumeration (HIME) to detect variable-length motifs with a large enumeration range in million-scale time series.  ...  Moreover, the motif length range detected by HIME is considerably larger than that of previous sequence-matching based approximate variable-length motif discovery approaches.  ...  large range of motif discovery, HIME can detect long and rare patterns in real world time series.  ... 
arXiv:1802.04883v1 fatcat:lz2bqvv7i5hlzezskqfoixbkxy

An Entropy-Based Position Projection Algorithm for Motif Discovery

Yipu Zhang, Ping Wang, Maode Yan
2016 BioMed Research International  
The experimental results on real DNA sequences, Tompa data, and ChIP-seq data show that EPP is advantageous in dealing with the motif discovery problem and outperforms current widely used algorithms.  ...  However, most existing motif discovery algorithms are still time-consuming or easily trapped in a local optimum.  ...  Experimental results on real DNA sequences, Tompa data, and ChIP-seq data demonstrate that EPP is advantageous in dealing with the motif discovery problem and outperforms current widely used approximate algorithms  ... 
doi:10.1155/2016/9127474 pmid:27882329 pmcid:PMC5110948 fatcat:7mcpbqmmp5g5tcxdon7d2ed2ga

A Selection-based MCL Clustering Algorithm for Motif Discovery

Chunxiao Sun, Zhiyong Zhang, Jinglei Tang, Shuai Liu
2018 American Journal of Biochemistry and Biotechnology  
The experimental results on simulation data show that the SMCLR algorithm has higher prediction accuracy in a reasonable time than existing motif discovery algorithms like Project, MEME, MCL-WMR and  ...  Moreover, the experimental results on real biological data demonstrate the effectiveness of the SMCLR algorithm.  ...  These are widely used in the current motif discovery algorithms to identify the real motifs.  ... 
doi:10.3844/ajbbsp.2018.298.306 fatcat:k2yrdmpukne3fnh43q7nzoiqz4

Unsupervised discovery of basic human actions from activity recording datasets

Yasser Mohammad, Toyoaki Nishida
2012 2012 IEEE/SICE International Symposium on System Integration (SII)  
The proposed system was evaluated on real records of full body motions and is shown in this paper to achieve high accuracy compared with a recently proposed motif discovery algorithm applied to the same  ...  This paper proposes the utilization of a novel motif discovery algorithm based on the exact MK algorithm to discover basic actions in activity records.  ...  This step just removes redundant sets. After that all sets marked for removal are deleted.  ... 
doi:10.1109/sii.2012.6426960 fatcat:4qcrv6tqqnehplkqf3ubblhqsq

Admissible Time Series Motif Discovery with Missing Data [article]

Yan Zhu, Abdullah Mueen, Eamonn Keogh
2018 arXiv   pre-print
The discovery of time series motifs has emerged as one of the most useful primitives in time series data mining.  ...  Although there has been more than a decade of extensive research, there is still no technique to allow the discovery of time series motifs in the presence of missing data, despite the well-documented ubiquity  ...  Introduction: Time series motifs are short approximately repeated patterns within a longer time series dataset.  ... 
arXiv:1802.05472v1 fatcat:n377j7acpbasndwl5zsfpkckui

Mining Historical Documents for Near-Duplicate Figures

Thanawin Rakthanmanon, Qiang Zhu, Eamonn J. Keogh
2011 2011 IEEE 11th International Conference on Data Mining  
Most of the data in historical manuscripts is text, but there is also a significant fraction devoted to images.  ...  To this end, we introduce an efficient and scalable system which can detect approximately repeated occurrences of shape patterns both within and between historical texts.  ...  Scalability and Noise Tolerance: Testing the scalability of our approach on real data provides us with significant challenges, since the running time of our algorithm depends on the data.  ... 
doi:10.1109/icdm.2011.102 dblp:conf/icdm/RakthanmanonZK11 fatcat:wkwzoiw2rfexzo742jjzulusda

Finding Surprisingly Frequent Patterns of Variable Lengths in Sequence Data

Reza Sadoddin, Joerg Sander, Davood Rafiei
2016 Proceedings of the 2016 SIAM International Conference on Data Mining  
Motif Discovery in Time Series Data: Another line of research related to our work is 'motif' discovery in time series data.  ...  Using the DUSC model, Assent et al. propose a new algorithm (INSCY: indexing subspace clusters with in-process-removal of redundancy) for removing repeated (redundant) clusters.  ...  Appendix A: List of single best motifs found by CPS on motif discovery benchmark  ... 
doi:10.1137/1.9781611974348.4 dblp:conf/sdm/SadoddinSR16 fatcat:yerjh5tny5a2rpezrhaqah4vem

Efficiently Finding Near Duplicate Figures in Archives of Historical Documents

Thanawin Rakthanmanon, Qiang Zhu, Eamonn J. Keogh
2012 Journal of Multimedia  
Most of the data in historical manuscripts is text, but there is also a significant fraction devoted to images.  ...  To this end, we introduce an efficient and scalable system that can detect approximately repeated occurrences of shape patterns both within and between historical texts.  ...  Scalability and Noise Tolerance: Testing the scalability of our approach on real data provides us with significant challenges, since the running time of our algorithm depends on the data.  ... 
doi:10.4304/jmm.7.2.109-123 fatcat:s5ppjnprhfdadenh2jiwovyjrm

Discriminative pattern mining and its applications in bioinformatics

Xiaoqing Liu, Jun Wu, Feiyang Gu, Jie Wang, Zengyou He
2014 Briefings in Bioinformatics  
The archetypical applications in bioinformatics include phosphorylation motif discovery, differentially expressed gene identification, discriminative genotype pattern detection, etc.  ...  Research on finding interesting discriminative patterns in class-labeled data evolves rapidly and lots of algorithms have been proposed to specifically address this problem.  ...  The problem of phosphorylation motif discovery has been widely explored and several effective algorithms have been proposed based on data mining techniques.  ... 
doi:10.1093/bib/bbu042 pmid:25433466 fatcat:sdladkbllzdjdfsmejtc227kom

Time Series Motifs Statistical Significance [chapter]

Nuno Castro, Paulo J. Azevedo
2011 Proceedings of the 2011 SIAM International Conference on Data Mining  
Time series motif discovery is the task of extracting previously unknown recurrent patterns from time series data. It is an important problem within applications that range from finance to health.  ...  Many algorithms have been proposed for the task of efficiently finding motifs. Surprisingly, most of these proposals do not focus on how to evaluate the discovered motifs.  ...  There is a plethora of time series motif discovery algorithms in the literature (see section 2).  ... 
doi:10.1137/1.9781611972818.59 dblp:conf/sdm/CastroA11 fatcat:l6bgcxtmi5bqfo2sqfxbu4mrfm

Online Discovery of Top-k Similar Motifs in Time Series Data [chapter]

Hoang Thanh Lam, Ninh Dang Pham, Toon Calders
2011 Proceedings of the 2011 SIAM International Conference on Data Mining  
A motif is a pair of non-overlapping sequences with very similar shapes in a time series. We study the online top-k most similar motif discovery problem.  ...  We demonstrate our results by both theoretical analysis and extensive experiments with both synthetic and real-life data.  ...  Based on discrete representations of real-valued data, these methods have introduced some levels of approximation in motif discovery [9].  ... 
doi:10.1137/1.9781611972818.86 dblp:conf/sdm/LamCP11 fatcat:5xxoszjqfndjxfvmov6deqvu4u

Exact Discovery of Time Series Motifs [chapter]

Abdullah Mueen, Eamonn Keogh, Qiang Zhu, Sydney Cash, Brandon Westover
2009 Proceedings of the 2009 SIAM International Conference on Data Mining  
Because the obvious algorithm for computing motifs is quadratic in the number of items, more than a dozen approximate algorithms to discover motifs have been proposed in the literature.  ...  In this work, for the first time, we show a tractable exact algorithm to find time series motifs.  ...  Thus the algorithm is approximate for real-valued time series.  ... 
doi:10.1137/1.9781611972795.41 pmid:31656693 pmcid:PMC6814436 dblp:conf/sdm/MueenKZCW09 fatcat:vif6qlq2m5c6bebz7mm5xssxhm