An Efficient Mining Algorithm by Bit Vector Table for Frequent Closed Itemsets

Keming Tang, Caiyan Dai, Ling Chen
2011 Journal of Software  
Mining frequent closed itemsets in data streams is an important task in stream data mining. This paper proposes an efficient mining algorithm (denoted EMAFCI) for frequent closed itemsets in data streams. The algorithm is based on the sliding window model and uses a Bit Vector Table (denoted BVTable), in which transactions are recorded by the column vectors and itemsets by the row vectors. The algorithm first builds the BVTable for the first sliding window. Frequent closed itemsets can be detected by pair-test operations on the binary numbers in the table. After building the first BVTable, the algorithm updates it for each subsequent sliding window, and the frequent closed itemsets in that window can be identified from the updated table. Algorithms are also proposed to modify the BVTable when a transaction is added or deleted. Experimental results on synthetic and real data sets indicate that the proposed algorithm needs less CPU time and memory than other similar methods.
doi:10.4304/jsw.6.11.2121-2128 fatcat:tgfjld6tozcfxmahjtrvt5plhy
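
The bit-vector idea in the abstract can be illustrated with a minimal Python sketch. It is not the EMAFCI algorithm itself: the abstract does not specify the pair-test operations, so the sketch substitutes a brute-force closure check based on bitwise AND. The class name BitVectorWindow, its methods, and the choice of one bit vector per single item (rather than per itemset) are all assumptions made for illustration.

```python
from itertools import combinations


class BitVectorWindow:
    """Hypothetical sliding window holding one bit vector per item.

    Bit 0 of each vector is the newest transaction in the window and
    bit (window_size - 1) is the oldest.
    """

    def __init__(self, window_size):
        self.window_size = window_size
        self.mask = (1 << window_size) - 1
        self.count = 0      # transactions currently in the window
        self.vectors = {}   # item -> int used as a bit vector

    def add(self, transaction):
        """Slide by one: the shift drops the oldest slot, bit 0 is the new one."""
        transaction = set(transaction)
        for item in transaction:
            self.vectors.setdefault(item, 0)
        for item in list(self.vectors):
            shifted = (self.vectors[item] << 1) & self.mask
            if item in transaction:
                shifted |= 1
            if shifted:
                self.vectors[item] = shifted
            else:
                del self.vectors[item]  # item no longer occurs in the window
        self.count = min(self.count + 1, self.window_size)

    def tidset(self, itemset):
        """AND the item vectors: set bits mark transactions containing all items."""
        bits = (1 << self.count) - 1
        for item in itemset:
            bits &= self.vectors.get(item, 0)
        return bits

    def frequent_closed(self, min_support):
        """Brute-force search (exponential in the number of items).

        An itemset is closed when no further item occurs in every
        transaction that contains it.
        """
        result = {}
        items = sorted(self.vectors)
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                bits = self.tidset(combo)
                support = bin(bits).count("1")
                if support < min_support:
                    continue
                closure = frozenset(
                    i for i in items if (self.vectors[i] & bits) == bits
                )
                if closure == frozenset(combo):
                    result[closure] = support
        return result
```

Calling add() once per incoming transaction and then frequent_closed(min_support) returns a dictionary that maps each frequent closed itemset in the current window to its support. The exhaustive enumeration is only practical for a handful of items; it is meant to show how support counting and closure testing reduce to operations on binary numbers, not to reproduce the efficiency reported in the paper.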