Data Stream Clustering Algorithms: A Review
by Maryam Mousavi, Azuraliza Abu Bakar, Mohammadmahdi Vakilian
2015
Abstract
Data stream mining has attracted growing research interest in recent years. Its key challenge is extracting valuable knowledge in real time from a massive, continuous, dynamic data stream in only a single scan. Clustering is an efficient tool for addressing this challenge. Data stream clustering can be applied in various fields such as financial transactions, telephone records, sensor network monitoring, telecommunications, website analysis, weather monitoring, and e-business. It also presents challenges of its own: it must be done in a short time frame, with limited memory, using a single-scan process. Moreover, because outliers in a data stream are hidden, clustering algorithms must be able to detect outliers and noise. In addition, the algorithms have to handle concept drift and detect arbitrarily shaped clusters. Several algorithms have been proposed to overcome these challenges. This paper presents a review of five types of data stream clustering approaches: partitioning, hierarchical, density-based, grid-based, and model-based. The data stream clustering algorithms in the literature are discussed in terms of their respective advantages and disadvantages.
Archived Files and Locations
application/pdf, 113.6 kB: web.archive.org (webarchive), home.ijasca.com (web)
article-journal