PBIRCH: A Scalable Parallel Clustering algorithm for Incremental Data

Ashwani Garg, Ashish Mangla, Neelima Gupta, Vasudha Bhatnagar
2006 Proceedings - International Database Engineering and Applications Symposium  
We present a parallel version of BIRCH that aims to improve scalability without compromising clustering quality. Incoming data is distributed cyclically (or block-cyclically if the data is bursty) to balance the load among processors. The algorithm is implemented on a message-passing, shared-nothing model. Experiments show that for very large data sets the algorithm scales nearly linearly with the number of processors, and that the clusters obtained by PBIRCH are comparable to those obtained using BIRCH.
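The cyclic and block-cyclic distribution mentioned in the abstract can be illustrated with a minimal sketch. It only simulates the assignment rule in a single process. The function names and the `block_size` parameter are hypothetical, and the paper's actual message-passing implementation is not reproduced here.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def block_cyclic_owner(index: int, num_procs: int, block_size: int = 1) -> int:
    """Return the processor that receives the item at position `index`.

    With block_size == 1 this is plain cyclic (round-robin) assignment.
    With block_size > 1, consecutive runs of `block_size` items go to the
    same processor before moving on to the next one.
    """
    if num_procs < 1 or block_size < 1:
        raise ValueError("num_procs and block_size must be positive")
    return (index // block_size) % num_procs


def distribute(items: Sequence[T], num_procs: int, block_size: int = 1) -> List[List[T]]:
    """Split `items` into one list per processor (illustrative only)."""
    parts: List[List[T]] = [[] for _ in range(num_procs)]
    for i, item in enumerate(items):
        parts[block_cyclic_owner(i, num_procs, block_size)].append(item)
    return parts


if __name__ == "__main__":
    stream = list(range(8))
    print(distribute(stream, 3))                # cyclic
    print(distribute(stream, 3, block_size=2))  # block cyclic
```

In this sketch each processor ends up with roughly the same number of points, which is the load-balancing goal the abstract describes. A larger block size is assumed to suit bursty input, since whole bursts can be forwarded as a unit.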
doi:10.1109/ideas.2006.36 dblp:conf/ideas/GargMGB06 fatcat:c4z34qa3vjhthdg2o3qthkshai