K Means Clustering Algorithm for Partitioning Data Sets Evaluated From Horizontal Aggregations

R. Rakesh Kumar
2013 IOSR Journal of Computer Engineering  
Data mining is the process of analyzing data from different perspectives and summarizing it into useful information, which users rely on both for analysis and for preparing data sets. A data set is a collection of data in tabular form. Preparing a data set involves complex SQL queries, table joins, and aggregate functions. A traditional RDBMS manages tables in a vertical format and returns one number per row. That is, it returns a single-value output, which is not suitable for preparing a data set. This paper focuses on the k-means clustering algorithm, which is used to partition data sets after horizontal aggregations, and gives a brief description of horizontal aggregation methods, which return a set of numbers instead of one number per row. The paper covers three methods for evaluating horizontal aggregations: SPJ, CASE, and PIVOT. Horizontal aggregations produce large volumes of data, so partitioning the results into homogeneous clusters is important; this can be performed by the k-means clustering algorithm.
doi:10.9790/0661-1254548 fatcat:nak4zfkxpndhlonpccgfs4ub3i
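
The pipeline the abstract describes (aggregate horizontally, then partition with k-means) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the table, the column names (`store`, `quarter`, `amount`), the helper names `horizontal_sum` and `kmeans`, and the choice to seed centroids with the first k rows are all assumptions made for illustration. The aggregation step mimics what the CASE method would compute in SQL, one output column per distinct value of the transposing column.

```python
from collections import defaultdict

def horizontal_sum(rows, key, pivot, value):
    """One output row per `key`, one summed column per distinct `pivot` value."""
    columns = sorted({r[pivot] for r in rows})
    totals = defaultdict(lambda: dict.fromkeys(columns, 0.0))
    for r in rows:
        totals[r[key]][r[pivot]] += r[value]
    return columns, {k: [cells[c] for c in columns] for k, cells in totals.items()}

def kmeans(points, k, iterations=100):
    """Plain Lloyd's k-means; seeding with the first k points is an assumption."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:  # converged: no assignment changed
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids

# Hypothetical vertical table: one amount per (store, quarter) row.
sales = [
    {"store": "A", "quarter": "Q1", "amount": 10.0},
    {"store": "A", "quarter": "Q1", "amount": 1.0},
    {"store": "A", "quarter": "Q2", "amount": 12.0},
    {"store": "B", "quarter": "Q1", "amount": 100.0},
    {"store": "B", "quarter": "Q2", "amount": 110.0},
    {"store": "C", "quarter": "Q1", "amount": 11.0},
    {"store": "C", "quarter": "Q2", "amount": 13.0},
    {"store": "D", "quarter": "Q1", "amount": 105.0},
    {"store": "D", "quarter": "Q2", "amount": 95.0},
]

columns, table = horizontal_sum(sales, "store", "quarter", "amount")
labels, centroids = kmeans(list(table.values()), k=2)
clusters = dict(zip(table.keys(), labels))
```

Here `table` holds one row per store with a column per quarter, and `clusters` maps each store to its cluster index. In practice the horizontal table would typically be produced in the database (by an SPJ, CASE, or PIVOT query) and clustered with a library routine; the hand-written loop above is only meant to make the two stages visible.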