Batch Active Learning With Two-Stage Sampling

Ronghua Luo, Xiang Wang
2020 IEEE Access  
Because it can train precise models with significantly fewer labeled instances, active learning has been widely researched and applied. To reduce the time complexity of active learning, so that the oracle need not wait for the algorithm to provide instances for labeling, we propose a new active learning method that combines batch sampling and direct boundary annotation with a two-stage sampling strategy. In the first sampling stage, the initial seed, which determines the location of boundary annotation, is selected by rejection sampling based on the clustering structure of the instances, so that the initial seeds approximate the data distribution and are highly diverse. In the second sampling stage, we treat instance sampling as the selection of a representative in a local region and maximize the reward obtained by selecting an instance as the new representative; this gives a novel mechanism for maintaining the local representativeness and diversity of the query instances. Compared with conventional pool-based active learning, the proposed method does not need to train the model in each iteration, which reduces computation and time consumption. Experimental results on three public datasets show that the proposed method performs comparably to uncertainty-based active learning methods, which indicates that its sampling mechanism is effective. It performs well without retraining the model in each iteration and does not rely on the precision of the model.
INDEX TERMS: Active learning, boundary annotation, instance sampling, generative adversarial model.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Volume 8, 2020
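The first-stage idea in the abstract (seeds chosen by rejection sampling guided by cluster structure) can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `select_seeds`, the acceptance rule 1/(1 + seeds already in the cluster), and the `max_tries` cap are all assumptions made here for illustration, and the cluster labels are assumed to come from any clustering method run beforehand.

```python
import random


def select_seeds(labels, n_seeds, rng=None, max_tries=10000):
    """Pick distinct instance indices by rejection sampling (illustrative only).

    labels[i] is the cluster id of instance i. Fewer than n_seeds indices
    may be returned if max_tries proposals are exhausted first.
    """
    if n_seeds > len(labels):
        raise ValueError("n_seeds exceeds the number of instances")
    rng = rng or random.Random()
    seeds = []
    per_cluster = {}  # cluster id -> number of seeds already accepted
    for _ in range(max_tries):
        if len(seeds) == n_seeds:
            break
        # Proposal: uniform over instances, so large clusters are proposed
        # more often and the seeds roughly follow the data distribution.
        cand = rng.randrange(len(labels))
        if cand in seeds:
            continue  # reject duplicates
        # Assumed acceptance rule: the more seeds a cluster already holds,
        # the less likely another one from it is accepted (diversity).
        held = per_cluster.get(labels[cand], 0)
        if rng.random() < 1.0 / (1 + held):
            seeds.append(cand)
            per_cluster[labels[cand]] = held + 1
    return seeds
```

For example, `select_seeds([0] * 15 + [1] * 5, 4, random.Random(0))` draws four distinct indices from a toy set with one large and one small cluster. The uniform proposal stands in for "approximating the data distribution", and the cluster-dependent rejection stands in for "high diversity"; the paper's actual criteria may differ.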
doi:10.1109/access.2020.2979315 fatcat:ubpnmseyf5hwtcl322ejnxb5ae