Approximate Data Mining Using Sketches for Massive Data

Parul Gupta, Swati Agnihotri, Suman Saha
2013 Procedia Technology - Elsevier  
With the popularity of the Web and the Internet, massive amounts of data are generated. However, these enormous datasets make it challenging to apply data mining techniques to extract useful information. Dimensionality reduction can improve both efficiency and effectiveness when extracting information from data. In this paper we propose an algorithm to reduce the dimensionality of the datasets such that, after applying data mining techniques on the reduced datasets, we get almost the same ... ts as with the original datasets. A random sketch is used to reduce the dimensions of the dataset.
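The abstract does not spell out how the sketch is built, so the following is only a minimal illustration of the general idea, assuming a random sign projection (entries of ±1/sqrt(k)), which is one common way to construct a random sketch. The function names `make_sketch_matrix`, `sketch`, and `euclidean`, and the sizes `d` and `k`, are hypothetical and not taken from the paper.

```python
import math
import random


def make_sketch_matrix(d, k, seed=0):
    """Build a k x d matrix with entries +1/sqrt(k) or -1/sqrt(k) (assumed construction)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(k)
    return [[rng.choice((-scale, scale)) for _ in range(d)] for _ in range(k)]


def sketch(point, matrix):
    """Project a d-dimensional point to k dimensions (one dot product per matrix row)."""
    return [sum(r * x for r, x in zip(row, point)) for row in matrix]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical sizes: reduce 1000 dimensions to 200.
d, k = 1000, 200
rng = random.Random(1)
a = [rng.gauss(0, 1) for _ in range(d)]
b = [rng.gauss(0, 1) for _ in range(d)]

R = make_sketch_matrix(d, k)
sa, sb = sketch(a, R), sketch(b, R)

# Ratio of sketched distance to original distance; values near 1 mean
# the pairwise distance survived the reduction.
ratio = euclidean(sa, sb) / euclidean(a, b)
```

Because distance-based mining methods such as clustering or nearest-neighbour search depend mainly on pairwise distances, a ratio close to 1 suggests why results on the reduced data can stay close to those on the original data. A larger `k` tightens the approximation at the cost of less reduction.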
doi:10.1016/j.protcy.2013.12.422 fatcat:uzlxbsewpvf3famrs2tnrkqwyq