Multivariable Statistical Correlation Measure Applied to Association Rules Mining

J. Hu, H.F. Jian, J.H. Sun
2015, Proceedings of the International Conference on Computer Information Systems and Industrial Applications (unpublished)
Correlation is usually defined for real-valued sequences. In data mining, however, values may be of various types: real, nominal, or ordinal. Methods for measuring correlation between multivariable sequences of data, regardless of their type, are reviewed. In particular, a new method for measuring the statistical correlation of multivariable sequences is proposed. Because the method relies on the geometric meaning of the dot product to obtain the degree of multivariable correlation, it is called M-correlation. In this paper, M-correlation is used to prune redundant association rules. To improve mining efficiency, a novel algorithm, FT-Miner, is presented that finds all frequent sub-trees in a forest using two new data structures, UFP-Tree and FP-Forest. Experiments show that the algorithm not only eliminates many useless rules but also performs better than classical algorithms.
doi:10.2991/cisia-15.2015.266 fatcat:lklhzyimabfitikl3nd3iiycfi
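
The abstract does not give the M-correlation formula, so the following is only a sketch of the well-known two-variable case it appears to generalize: the Pearson correlation of two real-valued sequences equals the cosine of the angle between their mean-centered vectors, which follows from the geometric meaning of the dot product. The function names (`centered`, `dot`, `cosine_correlation`) are illustrative assumptions and are not taken from the paper.

```python
import math


def centered(xs):
    """Return the sequence with its mean subtracted from every element."""
    mean = sum(xs) / len(xs)
    return [x - mean for x in xs]


def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def cosine_correlation(xs, ys):
    """Pearson correlation, computed as the cosine of the angle between
    the mean-centered vectors. This is the standard two-variable case,
    NOT the paper's M-correlation, whose definition is not in the abstract.
    """
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    u, v = centered(xs), centered(ys)
    norm = math.sqrt(dot(u, u) * dot(v, v))
    if norm == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return dot(u, v) / norm
```

For example, `cosine_correlation([1, 2, 3, 4], [2, 4, 6, 8])` is 1 up to floating-point rounding, because the centered vectors point in the same direction, and reversing the second sequence gives -1.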