Transaction Encoding Algorithm (TEA) for Distributed Data

A. Anbarasi, D. Sathyasrinivas, K. Vivekanandan
2011 International Journal of Computer Applications  
Analysis of huge datasets has been a major concern in almost all areas of technology in the past decade, and the role of data mining has become crucial as a result. As the data in these datasets grow from gigabytes to terabytes or even larger, collecting and warehousing such massive datasets at a single site becomes practically impossible, because the site may not have enough main memory to hold all the data. Therefore they are usually accumulated in [...]cally distributed sites. The challenge in distributed data mining is how to learn as much knowledge from distributed databases as we do from a centralized database without consuming too much communication bandwidth. One solution is to reduce the dimensionality of the massive dataset so that it can be collected and warehoused at a single site. Dimension reduction algorithms are generally classified into feature selection, feature extraction, and random projection. In this paper we propose a dimension reduction algorithm, different from all of these methods, that encodes transactions to reduce their size, which in turn reduces the communication cost. Experimental results on datasets demonstrate the performance of our proposed algorithm.
doi:10.5120/2030-2580 fatcat:gbgcted5o5dkbidjlcczwmet2a