An Accurate Grid-based PAM Clustering Method for Large Dataset

Faisal BinAlAbid, M.A. Mottalib
2012 International Journal of Computer Applications  
Clustering is the procedure of grouping similar objects together. Several clustering algorithms have been proposed. Among them, the K-means method has low time complexity, but it is sensitive to extreme values, which can make the clustering of a dataset less accurate. The K-medoids method does not have this limitation, but it relies on a user-defined value of K. If the number of clusters is not chosen correctly, it will not yield the natural number of clusters, and accuracy suffers. In this paper, we propose a grid-based clustering method that has higher accuracy than the existing K-medoids algorithm. Our proposed Grid Multi-dimensional K-medoids (GMK) algorithm uses the concept of a cluster validity index, and experimental results show that it is more accurate than the existing K-medoids method. The object space is quantized into a number of cells, and the distance between intra-cluster objects decreases, which contributes to the higher accuracy of the proposed method. The proposed approach therefore provides a more accurate, natural clustering that scales well to large datasets.
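The abstract names three ingredients: quantizing the object space into cells, running K-medoids, and using a cluster validity index to pick the number of clusters. The sketch below is one possible way to combine them, not the GMK algorithm itself, whose details the abstract does not give. Everything specific here is an assumption made for illustration: the function names, the use of cell centroids weighted by point count as representatives, the simple alternating medoid update (rather than the full PAM swap search), and the validity index (weighted mean distance to the medoid divided by the smallest distance between medoids, lower is better).

```python
import math
from collections import defaultdict

def quantize(points, cell_size):
    """Map points to grid cells; return (cell centroids, point counts per cell)."""
    cells = defaultdict(list)
    for p in points:
        cells[tuple(math.floor(x / cell_size) for x in p)].append(p)
    centroids, counts = [], []
    for key in sorted(cells):  # sorted for a deterministic cell order
        members = cells[key]
        dim = len(members[0])
        centroids.append(tuple(sum(m[d] for m in members) / len(members)
                               for d in range(dim)))
        counts.append(len(members))
    return centroids, counts

def assign(points, medoids):
    """Label each point with the position (in `medoids`) of its nearest medoid."""
    return [min(range(len(medoids)),
                key=lambda j: math.dist(p, points[medoids[j]]))
            for p in points]

def k_medoids(points, weights, k, max_iter=50):
    """Weighted k-medoids with an alternating update (assumed, not PAM swaps)."""
    medoids = [0]  # farthest-first initialisation, deterministic
    while len(medoids) < k:
        medoids.append(max(
            range(len(points)),
            key=lambda i: min(math.dist(points[i], points[m]) for m in medoids)))
    for _ in range(max_iter):
        labels = assign(points, medoids)
        new_medoids = []
        for j in range(k):
            members = [i for i, lab in enumerate(labels) if lab == j]
            # New medoid: the member with the smallest weighted distance sum.
            new_medoids.append(min(members, key=lambda c: sum(
                weights[i] * math.dist(points[c], points[i]) for i in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(points, medoids)

def validity(points, weights, medoids, labels):
    """Assumed index: compactness / separation (lower is better, needs k >= 2)."""
    intra = sum(w * math.dist(p, points[medoids[lab]])
                for p, w, lab in zip(points, weights, labels)) / sum(weights)
    inter = min(math.dist(points[a], points[b])
                for i, a in enumerate(medoids) for b in medoids[i + 1:])
    return intra / inter

def grid_k_medoids(points, cell_size, k_max):
    """Cluster grid cells for k = 2..k_max; return (score, k, medoid coords)."""
    cells, counts = quantize(points, cell_size)
    best = None
    # k is capped below the cell count so the index cannot reach a trivial 0.
    for k in range(2, min(k_max, len(cells) - 1) + 1):
        medoids, labels = k_medoids(cells, counts, k)
        score = validity(cells, counts, medoids, labels)
        if best is None or score < best[0]:
            best = (score, k, [cells[m] for m in medoids])
    return best
```

Calling `grid_k_medoids(points, cell_size, k_max)` on a list of coordinate tuples returns the best index value, the chosen K, and the medoid coordinates. Because clustering runs on cells rather than raw points, the cost of the medoid search depends on the number of occupied cells, which is the intuition behind grid-based methods scaling to large datasets; the choice of `cell_size` trades resolution against speed.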
doi:10.5120/5821-7808 fatcat:7v3p7o4nlnc5zn5dhydmlaxsqm