Discovering causality in large databases

Shichao Zhang, Chengqi Zhang
2002 Applied Artificial Intelligence  
A causal rule between two variables, X → Y, captures the relationship that the presence of X causes the appearance of Y. Because causal rules are more useful than association rules, techniques for mining them are beginning to be developed. However, the effectiveness of existing methods (such as the LCD and CU-path algorithms) is limited to mining causal rules among simple variables, and they are inadequate for discovering and representing causal rules among multi-value variables. In this paper, we propose that the causality between variables X and Y be represented in the form X → Y with a conditional probability matrix M_{Y|X}. We also propose a new approach to discovering causality in large databases based on partitioning. The approach partitions the items into item variables by decomposing "bad" item variables and composing "not-good" item variables. In particular, we establish a method to optimize causal rules that merges the "useless" information in the conditional probability matrices of extracted causal rules.
doi:10.1080/08839510290030264 fatcat:kx3b734y7rdljmxhimm6h6ywjm
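
To make the representation in the abstract concrete, the following is a minimal sketch of how a conditional probability matrix M_{Y|X} for two multi-value variables could be estimated from co-occurrence counts. It is only an illustration of the matrix itself, not the paper's partitioning or optimization method; the function name, the variable values, and the toy records are all hypothetical.

```python
from collections import Counter

def conditional_probability_matrix(pairs, x_values, y_values):
    """Estimate M[i][j] = P(Y = y_values[j] | X = x_values[i]) by relative frequency.

    `pairs` is a list of observed (x, y) tuples. This is an illustrative
    helper, not an algorithm taken from the paper.
    """
    joint = Counter(pairs)                    # counts of each (x, y) pair
    x_counts = Counter(x for x, _ in pairs)   # counts of each x value
    matrix = []
    for x in x_values:
        total = x_counts[x]
        # Rows for unseen x values are filled with zeros (an assumption).
        row = [joint[(x, y)] / total if total else 0.0 for y in y_values]
        matrix.append(row)
    return matrix

# Hypothetical toy data: X has three values, Y has two.
records = (
    [("low", "no")] * 3 + [("low", "yes")] * 1
    + [("medium", "no")] * 1 + [("medium", "yes")] * 1
    + [("high", "no")] * 1 + [("high", "yes")] * 3
)

M = conditional_probability_matrix(
    records, x_values=["low", "medium", "high"], y_values=["no", "yes"]
)
for x, row in zip(["low", "medium", "high"], M):
    print(x, row)
```

Each row of the resulting matrix is a probability distribution over the values of Y for one value of X, which is what lets a single rule X → Y describe multi-value variables rather than only presence or absence.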