A high-performance distributed algorithm for mining association rules

Assaf Schuster, Ran Wolff, Dan Trock
2005 Knowledge and Information Systems  
We present a new distributed association rule mining (D-ARM) algorithm that demonstrates superlinear speed-up with the number of computing nodes. The algorithm is the first D-ARM algorithm to perform a single scan over the database. As such, its performance is unmatched by any previous algorithm. Scale-up experiments over standard synthetic benchmarks demonstrate stable run time regardless of the number of computers. Theoretical analysis reveals a tighter bound on error probability than the one shown in the corresponding sequential algorithm. As a result of this tighter bound, and by utilizing the combined memory of several computers, the algorithm generates far fewer candidates than comparable sequential algorithms: the same order of magnitude as the optimum.
doi:10.1007/s10115-004-0176-3 fatcat:zdrh72xqnzfihmyxevyomex7s4