A Requirements Analysis for Parallel KDD Systems [chapter]

William A. Maniatty, Mohammed J. Zaki
2000 Lecture Notes in Computer Science  
The current generation of data mining tools has limited capacity and performance, since these tools tend to be sequential. This paper explores a migration path out of this bottleneck by considering an integrated hardware and software approach to parallelizing data mining. Our analysis shows that parallel data mining solutions require the following components: parallel data mining algorithms, parallel and distributed databases, parallel file systems, parallel I/O, tertiary storage, management of online data, support for heterogeneous data representations, security, quality of service, and pricing metrics. State-of-the-art technology in these areas is surveyed with an eye towards an integration strategy leading to a complete solution.
doi:10.1007/3-540-45591-4_47 fatcat:ekp6fbwpufh3bd4y6o5e6syz2i