Density-based Projected Clustering over High Dimensional Data Streams [chapter]

Irene Ntoutsi, Arthur Zimek, Themis Palpanas, Peer Kröger, Hans-Peter Kriegel
2012 Proceedings of the 2012 SIAM International Conference on Data Mining  
In this work, we propose a new density-based projected clustering algorithm, HDDStream, for high dimensional data streams.  ...  There exist methods for clustering over full dimensional streams and methods for finding clusters in subspaces of high dimensional static data.  ...  history of the data is not feasible. 2.2 Clustering high dimensional data Clustering high dimensional data has found a lot of attention, though mostly focused on static data so far.  ... 
doi:10.1137/1.9781611972825.85 dblp:conf/sdm/NtoutsiZPKK12 fatcat:3tvykbkkrzbtdjabpiichrlk3e

A Novel High Dimensional and High Speed Data Streams Algorithm: HSDStream

Irshad Ahmed, Irfan Ahmed, Waseem Shahzad
2016 International Journal of Advanced Computer Science and Applications  
High dimensional stream data is inherently more complex when used for clustering because the evolving nature of the stream data and high dimensionality make it non-trivial.  ...  This paper presents a novel high speed clustering scheme for high-dimensional data streams.  ...  In projected clustering high dimensional stream data is partitioned based on the preferred dimensions instead of the full dimensional space. Cao et al.  ... 
doi:10.14569/ijacsa.2016.070952 fatcat:jz2evpcm2ncbhl5n4uobp2zzem

Scaling up for high dimensional and high speed data streams: HSDStream [article]

Irshad Ahmed, Irfan Ahmed, Waseem Shahzad
2015 arXiv   pre-print
High dimensional stream data is inherently more complex when used for clustering because the evolving nature of the stream data and high dimensionality make it non-trivial.  ...  This paper presents a novel high speed clustering scheme for high dimensional data streams.  ...  In projected clustering high dimensional stream data has been partitioned based on preferred dimensions instead of full dimensional space. Cao et al.  ... 
arXiv:1510.03375v1 fatcat:s3ewalkrajfmtdug4dlfpwsgci

A Framework for Projected Clustering of High Dimensional Data Streams [chapter]

C. Aggarwal, J. Han, J. Wang, P. Yu
2004 Proceedings 2004 VLDB Conference  
Recent research discusses methods for projected clustering over high-dimensional data sets.  ...  However, a lot of stream data is high-dimensional in nature. High-dimensional data is inherently more complex in clustering, classification, and similarity search.  ...  HPStream introduces the concept of projected clustering to data streams. Since a lot of stream data is high-dimensional in nature, it is necessary to perform high quality high-dimensional clustering.  ... 
doi:10.1016/b978-012088469-8/50075-9 fatcat:alrskrirvbegrnc7oadcxvg4lq

A Framework for Projected Clustering of High Dimensional Data Streams [chapter]

Charu C. Aggarwal, Jiawei Han, Jianyong Wang, Philip S. Yu
2004 Proceedings 2004 VLDB Conference  
Recent research discusses methods for projected clustering over high-dimensional data sets.  ...  However, a lot of stream data is high-dimensional in nature. High-dimensional data is inherently more complex in clustering, classification, and similarity search.  ...  HPStream introduces the concept of projected clustering to data streams. Since a lot of stream data is high-dimensional in nature, it is necessary to perform high quality high-dimensional clustering.  ... 
doi:10.1016/b978-012088469-8.50075-9 dblp:conf/vldb/AggarwalHWY04 fatcat:wxjqt6jhebharaum73wfz7tfmm

Constraint-based discriminative dimension selection for high-dimensional stream clustering

Kitsana Waiyamai, Thanapat Kangkachit
2018 IJAIN (International Journal of Advances in Intelligent Informatics)  
Clustering data streams is an active research topic in data mining.  ...  SED-Stream is an efficient clustering algorithm that supports high dimensional data streams.  ...  The use of dimension projection techniques [2], [11]-[17] is one possible solution to reduce complexity in dealing with high dimensional streams.  ... 
doi:10.26555/ijain.v4i3.271 fatcat:hz5t2fjznzcerpahwh3mxfphsa

Detecting Projected Outliers in High-Dimensional Data Streams [chapter]

Ji Zhang, Qigang Gao, Hai Wang, Qing Liu, Kai Xu
2009 Lecture Notes in Computer Science  
In this paper, we study the problem of projected outlier detection in high dimensional data streams and propose a new technique, called Stream Projected Outlier deTector (SPOT), to identify outliers embedded  ...  The experimental results demonstrate the efficiency and effectiveness of SPOT in detecting outliers in high-dimensional data streams.  ...  The problem of detecting projected outliers from high-dimensional data streams can be formulated as follows: given a $\phi$-dimensional data stream $D$, for each data point $p_i = \{p_{i1}, p_{i2}, \ldots, p_{i\phi}\}$  ... 
doi:10.1007/978-3-642-03573-9_53 fatcat:jqg73cdn35e7rj7wiwqtealbrm

Divisive clustering of high dimensional data streams

David P. Hofmeyr, Nicos G. Pavlidis, Idris A. Eckley
2015 Statistics and computing  
We propose a fully incremental projected divisive clustering method for high-dimensional data streams that is motivated by high density clustering.  ...  Clustering streaming data is gaining importance as automatic data acquisition technologies are deployed in diverse applications.  ...  Expanding on this we propose a framework for streaming data clustering, which we refer to as High-dimensional Streaming Divisive Clustering (HSDC), that is able to: (i) identify clusters of arbitrary orientation  ... 
doi:10.1007/s11222-015-9597-y fatcat:4cyk7tp3bzevjngzpr76zdx4fy

SPOT: A System for Detecting Projected Outliers From High-dimensional Data Streams

Ji Zhang, Qigang Gao, Hai Wang
2008 2008 IEEE 24th International Conference on Data Engineering  
In this paper, we present a new technique, called Stream Projected Outlier deTector (SPOT), to deal with the outlier detection problem in high-dimensional data streams.  ...  This paper provides details on the motivation and technical challenges of detecting outliers from high-dimensional data streams, presents an overview of SPOT, and gives the plans for system demonstration  ...  ACKNOWLEDGMENT The research and development of SPOT are supported in part by a grant from the Natural Sciences and Engineering Research Council of Canada (Grant #: 312423) and the Killam Foundation.  ... 
doi:10.1109/icde.2008.4497638 dblp:conf/icde/ZhangGW08 fatcat:xzqxygwzczgz7kswqios2xucxm

Adaptive non-linear clustering in data streams

Ankur Jain, Zhihua Zhang, Edward Y. Chang
2006 Proceedings of the 15th ACM international conference on Information and knowledge management - CIKM '06  
Tier-2 exploits this segment structure to continuously project the streaming data non-linearly onto a low-dimensional space (LDS), before assigning them to a cluster.  ...  Due to the evolving nature of, and the one-pass restriction imposed by, the data stream model, traditional clustering algorithms are inapplicable for stream clustering.  ...  Subspace clustering approaches have also been proposed for clustering high dimensional data [6, 1].  ... 
doi:10.1145/1183614.1183636 dblp:conf/cikm/JainZC06 fatcat:dtwbk4itkzc7fht7go65wvyeui

An adaptive and dynamic dimensionality reduction method for high-dimensional indexing

Heng Tao Shen, Xiaofang Zhou, Aoying Zhou
2005 The VLDB journal  
One well-known approach to overcome degradation in performance with respect to increasing dimensions is to reduce the dimensionality of the original dataset before constructing the index.  ...  Second, data points in the different axis systems are indexed using a single B+-tree. Third, our technique is highly scalable in terms of data size and dimension.  ...  We also thank those anonymous reviewers for their valuable comments in improving the quality of our paper.  ... 
doi:10.1007/s00778-005-0167-3 fatcat:gf5ihi7it5a3vehdsl2zqmz4u4

A comprehensive survey of anomaly detection techniques for high dimensional big data

Srikanth Thudumu, Philip Branch, Jiong Jin, Jugdutt (Jack) Singh
2020 Journal of Big Data  
Acknowledgements The article processing charge is funded by Swinburne University of Technology, Australia.  ...  Strategies for tackling the problem of high dimensionality One way to address the problem of high dimensionality is to reduce the dimensionality which projects the whole data set into a lower dimensional  ...  Clusters are generally embedded in the subspaces of high-dimensional data.  ... 
doi:10.1186/s40537-020-00320-x fatcat:nrx7fnuzbvf65edoisv65by4s4

A Survey on Density based Micro-clustering Algorithms for Data Stream Clustering

Donia Augustine
2017 International Journal of Advanced Research in Computer Science and Software Engineering  
However, the process of data stream clustering has been the subject of much attention due to its effectiveness in data mining.  ...  This paper presents a review of such algorithms that cluster the data stream using the density estimate.  ...  HDDStream Algorithm This algorithm is for clustering high dimensional data streams.  ... 
doi:10.23956/ijarcsse/v7i1/0111 fatcat:5gnhpr4kyvbezkptwkr6otvq6i

An incremental data-stream sketch using sparse random projections [chapter]

Aditya Krishna Menon, Gia Vinh Anh Pham, Sanjay Chawla, Anastasios Viglas
2007 Proceedings of the 2007 SIAM International Conference on Data Mining  
We propose the use of random projections with a sparse matrix to maintain a sketch of a collection of high-dimensional data-streams that are updated asynchronously.  ...  We verify the validity of this sketch by applying it to an online clustering problem, where we compare our results to the offline algorithm and an existing L2 sketch, and observe comparable results in  ...  We verify the validity of our projection-based sketch by applying it to an online clustering problem, where we have to cluster high-dimensional data streams that are updated incrementally at some given  ... 
doi:10.1137/1.9781611972771.62 dblp:conf/sdm/MenonPCV07 fatcat:p7ibk4qx5fgnblgyimcreicvue

Clustering Big Data streams: recent challenges and contributions

Marwan Hassani, Thomas Seidl
2016 it - Information Technology  
Today's  ...  Since the growth of data  ...  In this article, novel methods for an efficient subspace clustering of high-dimensional big data streams are presented.  ...  Additionally, efficient and adaptive density-based clustering algorithms are presented for high-dimensional data streams.  ...  An efficient projected stream clustering algorithm for handling high-dimensional, noisy, evolving data streams is presented in PreDeConStream [16].  ... 
doi:10.1515/itit-2016-0007 fatcat:jmnybg4vhjd3xn63wjw6vmo63m