Data labeling method based on Cluster similarity using Rough Entropy for Categorical Data Clustering
International Journal of Engineering & Technology
In present research, Data mining is become one of the growing area which deals with data. Clustering is recognized as an efficient methodology in data grouping; to improve the efficiency of the clustering many researchers have used data labeling method. Labeling method works on similar data points, into the proper clusters. In categorical domain applying data labeling is not so easy when compare with numerical domain. In numeral domain it is easy to find difference between to data points, but
... data points, but in categorical it is not easy. Since data labeling on categorical is a challenging issue till date and it is quite complex to implement. The proposed methodology is deals on this problem. According proposed method a sample data will be taken. That sampled data further divides sliding windows, and then a normal clustering algorithm will be applied on one sliding window and divides into clusters. Rough membership Entropy function is used to find the similarity between unlabelled data points to labeled data points. The proposed methodology has two important features those are 1) The Data points will moved into their proper clusters, means the quality clusters will take places, 2) Proposed methodology will execute with high efficiency rate. In this paper the proposed methodology is applied on KDD Cup99 data sets, and the results shows appreciably more proficient than earlier works.