
Distance-based outlier queries in data streams: the novel task and algorithms

Fabrizio Angiulli, Fabio Fassetti
2010 Data mining and knowledge discovery  
This work proposes a method for detecting distance-based outliers in data streams under the sliding window model.  ...  The novel notion of one-time outlier query is introduced in order to detect anomalies in the current window at arbitrary points-in-time. Three algorithms are presented.  ...  The authors would like to thank the TAO Project Office for making available the collected measurements.  ... 
doi:10.1007/s10618-009-0159-9 fatcat:brnxdeb6uvddxasdae2benaiyi

Continuous outlier detection in data streams

Dimitrios Georgiadis, Maria Kontaki, Anastasios Gounaris, Apostolos N. Papadopoulos, Kostas Tsichlas, Yannis Manolopoulos
2013 Proceedings of the 2013 international conference on Management of data - SIGMOD '13  
Such outliers are referred to as distance-based outliers and are the focus of this work.  ...  a data stream.  ...  The data mining tasks currently supported by MOA include stream classification and clustering.  ... 
doi:10.1145/2463676.2463691 dblp:conf/sigmod/GeorgiadisKGPTM13 fatcat:y5ng3mo7tfcodnnytn7lr23xoa

Detecting distance-based outliers in streams of data

Fabrizio Angiulli, Fabio Fassetti
2007 Proceedings of the sixteenth ACM conference on Conference on information and knowledge management - CIKM '07  
In this work a method for detecting distance-based outliers in data streams is presented.  ...  We deal with the sliding window model, where outlier queries are performed in order to detect anomalies in the current window. Two algorithms are presented.  ...  The authors would like to thank the TAO Project Office for making available the collected measurements.  ... 
doi:10.1145/1321440.1321552 dblp:conf/cikm/AngiulliF07a fatcat:j7w7fedm5fe6hitfwtkbwl7r7a

Explainable Distance-Based Outlier Detection in Data Streams

Theodoros Toliopoulos, Anastasios Gounaris
2022 IEEE Access  
We extend this rationale for unsupervised distance-based outlier detection, and through investigating subspaces, we propose a novel labeling of outliers in a manner that is intuitive for the user and does  ...  Moreover, our solution is applicable to online settings and a complete prototype for detecting and explaining outliers in data streams using massive parallelism has been implemented.  ...  The proposed state-of-the-art multi-query streaming distance-based outlier detection algorithms include AMCOD [16], SOP [23], pMCSky [24] and MDUAL [7].  ... 
doi:10.1109/access.2022.3172345 fatcat:oazqjzmk7fbfjci7kahkibjyie

Mining and linking patterns across live data streams and stream archives

Di Yang, Kaiyu Zhao, Maryam Hasan, Hanyuan Lu, Elke Rundensteiner, Matthew Ward
2013 Proceedings of the VLDB Endowment  
We will demonstrate the visual analytics system VistreamT, that supports interactive mining of complex patterns within and across live data streams and stream pattern archives.  ...  In our demonstration, we will illustrate that with VistreamT, analysts can easily submit, monitor and interact with a broad range of query types for pattern mining.  ...  Both the incremental pattern representation and meta-query techniques can be applied to several complex pattern types, such as clusters, k nearest neighbors, and distance-based outliers [8].  ... 
doi:10.14778/2536274.2536312 fatcat:ghbugf64kbgkla7pcelzgbw2nq

Continuous Outlier Mining of Streaming Data in Flink [article]

Theodoros Toliopoulos, Anastasios Gounaris, Kostas Tsichlas, Apostolos Papadopoulos, Sandra Sampaio
2019 arXiv   pre-print
In recent years, several solutions have tackled the problem of distance-based outliers in data streams, where outliers must be mined continuously as new elements become available.  ...  In this work, we focus on distance-based outliers in a metric space, where the status of an entity as to whether it is an outlier is based on the number of other entities in its neighborhood.  ...  In Section 2, we have already discussed algorithms for outlier detection in streams. The next most related area to our work is parallel algorithms for distance-based outlier detection.  ... 
arXiv:1902.07901v1 fatcat:w7pqkipvgvffndtcfhr6e4hawe
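
The neighborhood-count definition quoted in the entry above can be illustrated with a minimal sketch. This is not the algorithm of any paper in this listing: it is a naive quadratic-time baseline, and the names `window_outliers`, `stream_outliers`, `R` (neighborhood radius), `k` (minimum neighbor count) and `window_size` are assumptions chosen for illustration. A point is reported as an outlier when fewer than `k` other points of the current count-based sliding window lie within Euclidean distance `R` of it.

```python
from collections import deque
from math import dist  # Euclidean distance, Python 3.8+


def window_outliers(window, R, k):
    """Return the points of `window` that have fewer than k neighbors within R."""
    points = list(window)
    outliers = []
    for i, p in enumerate(points):
        # Count the other points that fall inside the radius-R neighborhood of p.
        neighbors = sum(
            1 for j, q in enumerate(points) if i != j and dist(p, q) <= R
        )
        if neighbors < k:
            outliers.append(p)
    return outliers


def stream_outliers(stream, window_size, R, k):
    """Yield the outliers of the sliding window after each arrival (naive recomputation)."""
    window = deque(maxlen=window_size)  # the oldest point is evicted automatically
    for point in stream:
        window.append(point)
        yield window_outliers(window, R, k)


if __name__ == "__main__":
    stream = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (0.1, 0.1)]
    for step, found in enumerate(stream_outliers(stream, window_size=4, R=0.5, k=2)):
        print(step, found)
```

Recomputing every neighborhood on each arrival costs quadratic time in the window size; the streaming algorithms listed here are designed to avoid exactly that recomputation.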

Efficient and flexible algorithms for monitoring distance-based outliers over data streams

Maria Kontaki, Anastasios Gounaris, Apostolos N. Papadopoulos, Kostas Tsichlas, Yannis Manolopoulos
2016 Information Systems  
In this work, we propose new algorithms for continuous outlier monitoring in data streams, based on sliding windows.  ...  The problem offers significant challenges when a stream-based environment is considered, where data arrive continuously and outliers must be detected on-the-fly.  ...  of multiple distance-based outlier detection tasks with different values of k and R.  ... 
doi:10.1016/ fatcat:n6c3nru4yffz7lhoikmnba5cty

Penalty Parameter Selection for Hierarchical Data Stream Clustering

Amol Bhagat, Nilesh Kshirsagar, Priti Khodke, Kiran Dongre, Sadique Ali
2016 Procedia Computer Science  
The approaches presented in this paper are helpful for the researchers in the field of data stream clustering and data mining.  ...  Identifying the number of clusters required for the precise clustering of data streams is an open research area. This paper gives the overview of the hierarchical data stream clustering algorithms.  ...  Offline data stream mining used in like generating report based on web log streams. Clustering data streams is commonly a difficult task.  ... 
doi:10.1016/j.procs.2016.03.005 fatcat:fgyfgl7rsnh5vbi7ggmbkka47e

Incremental Local Outlier Detection for Data Streams

Dragoljub Pokrajac, Aleksandar Lazarevic, Longin Jan Latecki
2007 2007 IEEE Symposium on Computational Intelligence and Data Mining  
in detecting outliers and changes of distributional behavior in various data stream applications.  ...  In this paper, an incremental LOF (Local Outlier Factor) algorithm, appropriate for detecting outliers in data streams, is proposed.  ...  In this paper, we propose a novel incremental LOF algorithm that is appropriate for detecting outliers in data streams.  ... 
doi:10.1109/cidm.2007.368917 dblp:conf/cidm/PokrajacLL07 fatcat:xjbkafbc6bhktew3xiacuwrosq

Designing a Streaming Algorithm for Outlier Detection in Data Mining—An Incremental Approach

Kangqing Yu, Wei Shi, Nicola Santoro
2020 Sensors  
To design an algorithm for detecting outliers over streaming data has become an important task in many common applications, arising in areas such as fraud detections, network analysis, environment monitoring  ...  We also present another algorithm, C_LOF, based on a very popular and effective outlier detection algorithm called Local Outlier Factor (LOF) which unfortunately works only on batched data.  ...  The work was supported in part by NSERC under the Discovery grant program. Conflicts of Interest: The authors declare no conflict of interest.  ... 
doi:10.3390/s20051261 pmid:32110907 pmcid:PMC7085525 fatcat:7xppxj33afcs5etdyqvdvp75xu

Outlier Detection Methods and the Challenges for their Implementation with Streaming Data

Ankita Karale
2020 Journal of Mobile Multimedia  
The existing algorithms and techniques in this category are elaborated in detail and the advantages and shortcomings of these techniques are summarized.  ...  Data mining is the rule of dealing with big amounts of data and choosing the important. Outlier detection is data mining procedures that identify uncommon occasions and special cases.  ...  Though finding outlier in streaming data is more difficult task.  ... 
doi:10.13052/jmm1550-4646.1635 fatcat:fnrb55vouba43lszw7mmsrfgp4

An Effective Minimal Probing Approach with Micro-Cluster for Distance-based Outlier Detection in Data streams

Mohamed Jaward Bah, Hongzhi Wang, Mohamed Hammad, Furkh Zeshan, Hanan Aljuaid
2019 IEEE Access  
Outlier detection in data streams is considered a significant task in data mining that targets the discovery of elements in an unprecedented data arrival rate.  ...  INDEX TERMS Outlier detection, data streams, distance-based, micro-cluster.  ...  In summary, the following are the major contributions in this paper: 1) We propose a novel solution for the problem of detecting distance-based outliers in data streams by simultaneously applying the concept  ... 
doi:10.1109/access.2019.2946966 fatcat:h6laznxv3ze4lit7sg4bxvorh4

Mining Text Streams [chapter]

Charu C. Aggarwal
2012 Mining Text Data  
In this chapter, we review text stream mining algorithms for a wide variety of problems in data mining such as clustering, classification and topic modeling.  ...  The large amount of text data which are continuously produced over time in a variety of large scale applications such as social networks results in massive streams of data.  ...  In the event that the corresponding distance is above a given threshold, we can declare the underlying story as novel.  ... 
doi:10.1007/978-1-4614-3223-4_9 fatcat:x3q4nx36zzea7pn2y5w7xmli34

Overview Of Streaming-Data Algorithms

T Soni Madhulatha
2011 Advanced Computing An International Journal  
Clustering algorithms in general have been categorized into five types: partitioning, hierarchical, density-based, grid-based, and model-based.  ...  To avoid the problems with non-uniform sized or shaped clusters, CURE employs a novel hierarchical clustering algorithm that adopts a middle ground between the centroid based and all point extremes.  ...  CONCLUSION Most of the algorithms generally assume some implicit structure in the data set.  ... 
doi:10.5121/acij.2011.2614 fatcat:u77g7kwauvhodaedzlyyibzksy

Efficient anomaly monitoring over moving object trajectory streams

Yingyi Bu, Lei Chen, Ada Wai-Chee Fu, Dawei Liu
2009 Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '09  
In this paper, we present a novel framework for monitoring anomalies over continuous trajectory streams.  ...  First, we illustrate the importance of distance-based anomaly monitoring over moving object trajectories.  ...  streams, and propose data structures and algorithms employing local clustering and piecewise VP-tree based rescheduling to efficiently conduct such a task.  ... 
doi:10.1145/1557019.1557043 dblp:conf/kdd/BuCFL09 fatcat:zp4jfgxpujdwfof5psysqyfjd4