Online Active Learning Ensemble Framework for Drifted Data Streams

Jicheng Shan, Hang Zhang, Weike Liu, Qingbao Liu
2018 IEEE Transactions on Neural Networks and Learning Systems  
In practical applications, data stream classification faces significant challenges, such as the high cost of labeling instances and potential concept drift. We present a new online active learning ensemble framework for drifting data streams based on a hybrid labeling strategy that includes the following: 1) an ensemble classifier, which consists of a long-term stable classifier and multiple dynamic classifiers (a multilevel sliding window model is used to create and update the dynamic classifiers to effectively process data streams with both gradual and sudden drift) and 2) active learning, which takes a nonfixed labeling budget, supports on-demand label requests, and adopts an uncertainty strategy and a random strategy to label instances. The decision threshold of the uncertainty strategy is adjusted dynamically, i.e., when concept drift occurs, the threshold is gradually reduced so that the most uncertain instances are queried first, keeping the request expense as low as possible. Experiments on synthetic and real data sets show that the proposed method achieves high prediction accuracy without increasing the total labeling cost, and that the labeling cost can be dynamically allocated according to the concept drift.
Index Terms: Active learning, classifier ensemble, concept drift, data stream.
ISSN 2162-237X
doi:10.1109/tnnls.2018.2844332 pmid:29994730 fatcat:3bhwcx5gnbfzddmpmupnwtynmy
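
The hybrid labeling strategy described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no update rule, so the class name HybridLabelingStrategy, the multiplicative decay, the floor min_threshold, the reset to the initial threshold when no drift is flagged, and all default values are assumptions made for illustration. The sketch also assumes that an instance counts as uncertain when its highest predicted class probability falls below the threshold, and that drift detection is done elsewhere and passed in as a flag.

```python
import random


class HybridLabelingStrategy:
    """Hypothetical sketch: uncertainty querying with a drift-adjusted
    threshold, combined with random querying. All parameters are assumed."""

    def __init__(self, threshold=0.9, min_threshold=0.6, decay=0.95,
                 random_rate=0.05, seed=0):
        self.initial_threshold = threshold
        self.threshold = threshold
        self.min_threshold = min_threshold  # assumed floor for the threshold
        self.decay = decay                  # assumed multiplicative reduction
        self.random_rate = random_rate      # chance of a random query
        self.rng = random.Random(seed)

    def should_query(self, class_probabilities, drift_detected):
        """Return True if a label should be requested for this instance."""
        if drift_detected:
            # Gradually reduce the threshold while drift is flagged, so only
            # the most uncertain instances still trigger a request.
            self.threshold = max(self.min_threshold,
                                 self.threshold * self.decay)
        else:
            # Assumption: restore the initial threshold once drift is over.
            self.threshold = self.initial_threshold

        # Uncertainty strategy: low top-class confidence triggers a query.
        uncertain = max(class_probabilities) < self.threshold
        # Random strategy: occasionally query regardless of confidence.
        random_pick = self.rng.random() < self.random_rate
        return uncertain or random_pick
```

In a streaming loop, should_query would be called once per arriving instance with the ensemble's predicted class probabilities, and the true label would be requested only when it returns True, so the number of requests varies with the drift state rather than following a fixed budget.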