Survey on Distributed Data Mining in P2P Networks [article]

Rekha Sunny T, Sabu M. Thampi
2012 arXiv pre-print
The exponential growth in available digital data, and the necessity of processing it in business and scientific fields, has forced upon us the need to analyze it and mine useful knowledge from it. Traditionally, data mining has used a data warehousing model: gather all data at a central site, then run an algorithm on that data. Such a centralized approach is fundamentally inappropriate for many reasons, such as the huge amount of data, the infeasibility of centralizing data at multiple sites, bandwidth limitations, and privacy concerns. To solve these problems, Distributed Data Mining (DDM) has emerged as a hot research area. DDM pays careful attention to using distributed resources of data, computing, communication, and human factors in a near-optimal fashion. DDM is gaining attention in peer-to-peer (P2P) systems, which are emerging as a solution of choice for applications such as file sharing, collaborative movie and song scoring, electronic commerce, and surveillance using sensor networks. The main intention of this draft paper is to provide an overview of DDM and P2P data mining. The paper discusses the need for DDM, a taxonomy of DDM architectures, various DDM approaches, DDM-related work in P2P systems, and issues and challenges in P2P data mining.
arXiv:1205.3231v1 fatcat:5tajkiqlg5hufjrhiy3xzz4d4m