Mining frequent itemsets over tuple-evolving data streams

Chongsheng Zhang, Yuan Hao, Mirjana Mazuran, Carlo Zaniolo, Hamid Mousavi, Florent Masseglia
2013 Proceedings of the 28th Annual ACM Symposium on Applied Computing - SAC '13  
In many data streaming applications today, tuples inside the streams may get revised over time. This type of data stream brings new issues and challenges to the data mining tasks. We present a theoretical analysis for mining frequent itemsets from sliding windows over such data. We define conditions that determine whether an infrequent itemset will become frequent when some existing tuples inside the streams have been updated. We design simple but effective structures for managing both the ... ing tuples and the candidate frequent itemsets. Moreover, we provide a novel verification method that efficiently computes the counts of candidate itemsets. Experiments on real-world datasets show the efficiency and effectiveness of our proposed method.
doi:10.1145/2480362.2480419 dblp:conf/sac/ZhangHMZMM13 fatcat:vrnebwooirahhejqn56xwscwmy