
Addressing Big Data Time Series: Mining Trillions of Time Series Subsequences Under Dynamic Time Warping

Thanawin Rakthanmanon, Bilson Campana, Abdullah Mueen, Gustavo Batista, Brandon Westover, Qiang Zhu, Jesin Zakaria, Eamonn Keogh
2013 ACM Transactions on Knowledge Discovery from Data  
We demonstrate the following unintuitive fact: in large datasets we can exactly search under Dynamic Time Warping (DTW) much more quickly than the current state-of-the-art Euclidean distance search algorithms  ...  Most time series data mining algorithms use similarity search as a core subroutine, and thus the time taken for similarity search is the bottleneck for virtually all time series data mining algorithms,  ...  ACKNOWLEDGMENTS We thank all the donors of code and data. We thank the reviewers for their useful comments. Papapetrou et al. [2011] omits the length of the test data we mention in Section 1. T.  ... 
pmid:31607834 pmcid:PMC6790126 fatcat:tidmvwtzejgbrdvv5rghif5vyy

Data Mining Techniques using Time Series Research

2019 International journal of recent technology and engineering  
The similarity measure plays a primary role in time series data mining and improves the accuracy of data mining tasks.  ...  Streaming of data is one of the difficult tasks that should be managed over time. Thus, this paper provides basic knowledge about time series in the data mining research field.  ...  Rule discovery, classification, clustering, prediction and pattern mining all come under time series mining techniques.  ... 
doi:10.35940/ijrte.b1020.0982s1119 fatcat:6l4l7o2j5fdd5l3zbngjou7yfu

Towards a Near Universal Time Series Data Mining Tool: Introducing the Matrix Profile [article]

Chin-Chia Michael Yeh
2020 arXiv   pre-print
Surprisingly, however, little progress has been made on addressing this problem for time series subsequences.  ...  By building time series data mining methods on top of the matrix profile, many time series data mining tasks (e.g., motif discovery, discord discovery, shapelet discovery, semantic segmentation, and clustering  ...  one-nearest-neighbor with z-normalized Euclidean distance, one-nearest-neighbor with z-normalized dynamic time warping with warping window r, and one-nearest-neighbor with z-normalized dynamic time warping  ... 
arXiv:1811.03064v2 fatcat:2p5o45bedjfyxei34kgrnliojm

Instruction set extensions for Dynamic Time Warping

Joseph Tarango, Eamonn Keogh, Philip Brisk
2013 International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS)
To address this concern, we introduce a specialized instruction set for time-series data mining applications to a 32-bit embedded processor, yielding a 4.87x performance improvement and a 78% reduction  ...  Although most sensor data is fixed-point, the normalization process, an absolute necessity for highly accurate similarity search of time-series data, converts the data to floating-point in order to avoid  ...  Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect those of the NSF.  ... 
doi:10.1109/codes-isss.2013.6659005 dblp:conf/codes/TarangoKB13 fatcat:yn6dqgqiujcojhurrpaoc3jnfe

Mining Melodic Patterns in Large Audio Collections of Indian Art Music

Sankalp Gulati, Joan Serra, Vignesh Ishwar, Xavier Serra
2014 Tenth International Conference on Signal-Image Technology and Internet-Based Systems
We compute similarity between melodic patterns using dynamic time warping (DTW).  ...  We present a data-driven approach for the discovery of short-time melodic patterns in large collections of Indian art music.  ...  Over 13 trillion distance computations are done in this task. To the best of our knowledge, this is the first time melodic patterns are mined from such a large volume of audio data.  ... 
doi:10.1109/sitis.2014.73 dblp:conf/sitis/GulatiSIS14 fatcat:sxuxbdsi7vbalhwvtujjkjatqu

DTW-MIC Coexpression Networks from Time-Course Data [article]

Samantha Riccadonna, Giuseppe Jurman, Roberto Visintainer, Michele Filosi, Cesare Furlanello
2014 arXiv   pre-print
Here we propose to overcome these two issues by employing a novel similarity function, Dynamic Time Warping Maximal Information Coefficient (DTW-MIC), combining a measure taking care of functional interactions  ...  When modeling coexpression networks from high-throughput time course data, Pearson Correlation Coefficient (PCC) is one of the most effective and popular similarity functions.  ...  Rakthanmanon T, Campana B, Mueen A, Batista G, Westover B, Zhu Q, et al. (2012) Searching and mining trillions of time series subsequences under Dynamic Time Warping.  ... 
arXiv:1210.3149v2 fatcat:m2hurjqkqrgmzfirqqlh6lvf7a

Fast and Exact Monitoring of Co-Evolving Data Streams

Yasuko Matsubara, Yasushi Sakurai, Naonori Ueda, Masatoshi Yoshikawa
2014 IEEE International Conference on Data Mining
Our experiments on 67GB of real data illustrate that StreamScan does indeed detect the qualifying subsequence patterns correctly and that it can offer great improvements in speed (up to 479,000 times)  ...  Our aim is to monitor data streams statistically, and find subsequences that have the characteristics of a given hidden Markov model (HMM).  ...  SPRING [29] is an efficient algorithm for monitoring multiple numerical streams under the dynamic time warping (DTW) distance.  ... 
doi:10.1109/icdm.2014.62 dblp:conf/icdm/MatsubaraSUY14 fatcat:pigqdwv7frgmboc4wolyaurpf4

Time series data mining methods [article]

Caroline Kleist, Humboldt-Universität zu Berlin, Felix Jung
This review gives an overview of the challenges of large time series and the problem-solving approaches proposed by the time series data mining community.  ...  Today, real-world time series data sets can reach a trillion observations or more. The data miner's task is to detect new information that is hidden in this massive amount of data.  ...  Rakthanmanon et al. (2012) were the first to develop a similarity search algorithm based on dynamic time warping (see Section 3.5.1) that allows mining a trillion time series objects.  ... 
doi:10.18452/14237 fatcat:sfi72ccksvhrrmltk4mb24teau

Online Multi-horizon Transaction Metric Estimation with Multi-modal Learning in Payment Networks [article]

Chin-Chia Michael Yeh, Zhongfang Zhuang, Junpeng Wang, Yan Zheng, Javid Ebrahimi, Ryan Mercer, Liang Wang, Wei Zhang
2021 arXiv   pre-print
We also propose a hybrid offline/online training scheme to address concept drift in the data and fulfill the real-time requirements.  ...  However, new domain-related challenges associated with the data such as concept drift and multi-modality have surfaced in addition to the real-time requirements of handling the payment transaction data  ...  and mining trillions of time series subsequences under dynamic time warping. Time series analysis: forecasting and control. John Wiley & Sons.  ... 
arXiv:2109.10020v1 fatcat:gkqvm3py6naefihk5oy6hp75sa

Unsupervised classification of variable stars

Lucas Valenzuela, Karim Pichara
2017 Monthly notices of the Royal Astronomical Society  
Every time data comes from new surveys, the only available training instances are the ones that have a cross-match with previously labelled objects, consequently generating insufficient training sets compared  ...  We also develop a fast similarity function specific for light curves, based on a novel data structure that allows scaling the search over the entire dataset of unlabelled objects.  ...  This paper utilizes public domain data obtained by the MACHO Project, jointly funded by the US Department of Energy through the University of California, Lawrence Livermore National Laboratory under contract  ... 
doi:10.1093/mnras/stx2913 fatcat:ob2qd7kqznfsho6qebm5242eey

A Survey on Trajectory Big Data Processing

Amina Belhassena
2018 International Journal of Performability Engineering  
Furthermore, this paper reviews an extensive collection of existing applications of movement objects, including trajectory data mining and frequent trajectory.  ...  Further, a wide spectrum of application domains can benefit from trajectory data mining including trajectory organization as well as queries.  ...  Heilongjiang Province LC2016026 and MOE-Microsoft Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology.  ... 
doi:10.23940/ijpe.18.02.p13.320333 fatcat:m74w3cfajrbzpamzpghfyrm6am

Time Series Classification at Scale

This thesis develops scalable algorithms and techniques to classify large amounts of time series data. Nowadays, many real-world applications are generating huge amounts of time series data.  ...  Unfortunately, the state-of-the-art classification algorithms are impractical for large amounts of time series data. Models that are accurate but slow are not good.  ...  [32] introduce the UCR SUITE algorithm to search trillions of time series subsequences under DTW using all these optimisation techniques.  ... 
doi:10.26180/5c9f12307f11b fatcat:brsbtamwbnfujestu7ugyrpdcy

Fast Popularity Value Calculation of Virtual Cryptocurrency Trading Stage Based on Machine Learning

Tong Zhu, Chenyang Liao, Ziyang Zhou, Xinyu Li, Qingfu Zhang
2022 Frontiers in Physics  
As of August 1, 2021, more than 11,570 virtual cryptocurrencies have been publicly issued and traded globally, with a total value of over $1.68 trillion.  ...  Virtual cryptocurrency is one of the application directions of blockchain technology.  ...  DTW (Dynamic Time Warping) [46] [47] [48] [49] [50] solves this problem. DTW in effect computes the Euclidean distance along an optimal alignment of the two series.  ... 
doi:10.3389/fphy.2021.788508 doaj:8db3eb9af2ba4887b4b58f0676a5a5de fatcat:ahhcpvzkzbagvbplecpgv7ggl4

Locality-sensitive hashing for earthquake detection

Kexin Rong, Clara E. Yoon, Karianne J. Bergen, Hashem Elezabi, Peter Bailis, Philip Levis, Gregory C. Beroza
2018 Proceedings of the VLDB Endowment  
However, a straightforward implementation of this LSH-enabled application has difficulty scaling beyond 3 months of continuous time series data measured at a single seismic station.  ...  We describe several end-to-end optimizations of the analysis pipeline from pre-processing to post-processing, which allow the application to scale to time series data measured at multiple seismic stations  ...  under grant EAR-1818579 and CAREER grant CNS-1651570.  ... 
doi:10.14778/3236187.3236214 fatcat:26lez7nlj5cc5cwulxwggk5lwm

Case Study for the Return on Investment of Internet of Things Using Agent-Based Modelling and Data Science

Charles Houston, Stephen Gooberman-Hill, Richard Mathie, Andrew Kennedy, Yunxi Li, Pedro Baiz
2017 Systems  
A specific case study is addressed to assess the return on investment of installing condition monitoring sensors on lift assets in a London Underground station.  ...  Traditional simulation and analysis techniques cannot model the complex systems inherent in fields such as infrastructure asset management, or suffer from a lack of data on which to build a prediction.  ...  Author Contributions: C.H. wrote most of the report and performed simulations and analysis; S.G.H., R.M., A.K. and Y.L. assisted with expertise, data acquisition and analysis as well as simulation design  ... 
doi:10.3390/systems5010004 fatcat:q47ladtthva7rkiw6t4qycrdo4