Resource-oriented approximation for frequent itemset mining from bursty data streams

Yoshitaka Yamamoto, Koji Iwanuma, Shoshi Fukuda
2014 Proceedings of the 2014 ACM SIGMOD international conference on Management of data - SIGMOD '14  
This study considers approximation techniques for frequent itemset mining from data streams (FIM-DS) under resource constraints. In FIM-DS, a challenging problem is handling the huge combinatorial number of entries (i.e., itemsets) generated from each streaming transaction and stored in memory. Various approximation methods have been proposed for FIM-DS. However, these methods require almost O(2^L) space for the maximal transaction length L. If some transaction contains sudden and intensive bursty events for a short span, they cannot work, since memory consumption increases exponentially as L becomes larger. Thus, we present resource-oriented approximation algorithms that fix an upper bound on memory consumption to tolerate bursty transactions. The proposed algorithm requires only O(k) space for a resource-specified constant k and processes every transaction in O(kL) time. Consequently, the proposed algorithm can treat any transaction without memory overflow or fatal response delay, while the output is guaranteed to contain no false negatives under some conditions. Moreover, any output (even one with false negatives) is bounded within an approximation error that is determined dynamically in a resource-oriented manner. From an empirical viewpoint, it is necessary to keep the error as low as possible. We tackle this problem by dynamically reducing the original stream. Through experimental results, we show that the resource-oriented approach can break the space limitation of previously proposed FIM-DS methods.
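The idea of fixing a memory bound k regardless of transaction length can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it swaps in a Space-Saving-style replacement rule applied to itemsets, plus a hypothetical per-transaction cap of k candidate subsets. The names BoundedItemsetCounter, nonempty_subsets, and skipped are assumptions made for illustration, and the sketch does not attempt to match the O(kL) time bound or the false-negative guarantee stated in the abstract.

```python
from itertools import chain, combinations, islice


def nonempty_subsets(transaction):
    """Lazily yield the non-empty subsets of a transaction, smallest first."""
    items = sorted(set(transaction))
    return chain.from_iterable(
        combinations(items, r) for r in range(1, len(items) + 1)
    )


class BoundedItemsetCounter:
    """Illustrative counter that never stores more than k itemsets.

    Assumption: Space-Saving-style eviction, not the paper's method.
    """

    def __init__(self, k):
        if k < 1:
            raise ValueError("k must be at least 1")
        self.k = k
        # itemset -> [estimated count, possible overestimation]
        self.table = {}
        # number of candidate subsets dropped by the per-transaction cap
        self.skipped = 0

    def update(self, transaction):
        # Hypothetical cap: look at no more than k subsets per transaction,
        # so a bursty transaction cannot force 2^L - 1 table operations.
        for itemset in islice(nonempty_subsets(transaction), self.k):
            if itemset in self.table:
                self.table[itemset][0] += 1
            elif len(self.table) < self.k:
                self.table[itemset] = [1, 0]
            else:
                # Evict the entry with the smallest estimate; the newcomer
                # inherits that estimate as its possible overestimation.
                victim = min(self.table, key=lambda s: self.table[s][0])
                floor = self.table.pop(victim)[0]
                self.table[itemset] = [floor + 1, floor]
        n = len(set(transaction))
        self.skipped += max(0, (2 ** n - 1) - self.k)


counter = BoundedItemsetCounter(k=8)
for t in (["a", "b"], ["a", "b", "c"], ["a"], list("abcdefghij")):
    counter.update(t)
```

In this sketch the last transaction has 10 items and therefore 2^10 - 1 = 1023 non-empty subsets, yet the table still holds at most k = 8 entries. The skipped counter records how many candidates were never examined, which is one crude way to expose the accuracy cost of the memory bound.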
doi:10.1145/2588555.2612171 dblp:conf/sigmod/YamamotoIF14 fatcat:7a3jjpsn7zfiden6msylmnuz5q