DCF: An Efficient Data Stream Clustering Framework for Streaming Applications [chapter]

Kyungmin Cho, Sungjae Jo, Hyukjae Jang, Su Myeon Kim, Junehwa Song
2006 Lecture Notes in Computer Science  
Streaming applications, such as environment monitoring and vehicle location tracking, must handle high volumes of continuously arriving data and sudden fluctuations in those volumes while efficiently supporting multidimensional historical queries. Traditional database management systems are ill-suited to this workload because continuously updating massive data streams requires an excessive number of disk I/Os. In this paper, we propose DCF (Data Stream Clustering Framework), a novel framework that supports efficient data stream archiving for streaming applications. DCF substantially reduces disk I/O in the storage system by grouping incoming data into clusters and storing the clusters instead of individual raw data elements. In addition, even when the amount of incoming data fluctuates temporarily, DCF can stably store all incoming raw data by controlling the cluster size. Our experimental results show that this approach significantly reduces the number of disk accesses for both inserting and retrieving data.
doi:10.1007/11827405_12 fatcat:h5krqd56kbcwbm2sgc3jvvcn4m