Training Classifiers under Covariate Shift by Constructing the Maximum Consistent Distribution Subset

Xu Yu, Miao Yu, Li-xun Xu, Jing Yang, Zhi-qiang Xie
2015 Mathematical Problems in Engineering  
The assumption that training and testing samples are drawn from the same distribution is violated under the covariate shift setting. Most algorithms for this setting first estimate the distributions and then reweight samples based on the estimates. Because a correct distribution is difficult to estimate, previous methods cannot achieve good classification performance. In this paper, we first present two types of covariate shift problems. Rather than estimating the distributions, we then seek an effective method to select, from the auxiliary set or the target training set, a maximum subset that follows the target testing distribution, based on a split of the feature space. Finally, we prove that our subset selection method can consistently deal with both scenarios of covariate shift. Experimental results demonstrate that training a classifier with the selected maximum subset yields better generalization ability and running efficiency than traditional methods under the covariate shift setting.
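
The idea of selecting a subset that follows the target testing distribution through a feature space split can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes a uniform grid as the split, and the names `cell_of`, `select_matching_subset`, and `width` are hypothetical. Within each grid cell it keeps a number of source samples proportional to the number of target samples in that cell, using the largest proportionality factor the source set can support.

```python
import math
import random
from collections import defaultdict


def cell_of(x, width):
    """Map a feature vector to its grid cell (a tuple of integer bin indices)."""
    return tuple(math.floor(v / width) for v in x)


def select_matching_subset(source, target, width, seed=0):
    """Return sorted indices of source samples whose per-cell counts follow target.

    Illustrative assumption: the feature space is split into a uniform grid
    with cells of side `width`. Counts are rounded down, so the match to the
    target proportions is approximate.
    """
    rng = random.Random(seed)

    # Group source indices by the cell they fall into.
    src_cells = defaultdict(list)
    for i, x in enumerate(source):
        src_cells[cell_of(x, width)].append(i)

    # Count target samples per cell.
    tgt_counts = defaultdict(int)
    for x in target:
        tgt_counts[cell_of(x, width)] += 1
    if not tgt_counts:
        return []

    # Largest factor alpha with alpha * target_count <= source_count in every
    # cell that contains target samples. A target cell with no source samples
    # forces alpha to 0, and the selected subset is then empty.
    alpha = min(len(src_cells.get(c, [])) / n for c, n in tgt_counts.items())

    selected = []
    for c, n in tgt_counts.items():
        k = math.floor(alpha * n + 1e-9)  # small tolerance for float rounding
        selected.extend(rng.sample(src_cells.get(c, []), k))
    return sorted(selected)
```

A classifier would then be trained only on the selected source samples. The grid width controls the trade-off in this sketch: finer cells match the target more closely but leave fewer source samples per cell, which shrinks the subset.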
doi:10.1155/2015/302815 fatcat:y5wifenerbe4riu3cfjsms5xnm