Class Proportion Estimation with Application to Multiclass Anomaly Rejection [article]

Tyler Sanderson, Clayton Scott
2014 arXiv pre-print
This work addresses two classification problems that fall under the heading of domain adaptation, wherein the distributions of training and testing examples differ. The first problem studied is that of class proportion estimation, which is the problem of estimating the class proportions in an unlabeled testing data set given labeled examples of each class. Compared to previous work on this problem, our approach has the novel feature that it does not require labeled training data from one of the classes. This property allows us to address the second domain adaptation problem, namely, multiclass anomaly rejection. Here, the goal is to design a classifier that has the option of assigning a "reject" label, indicating that the instance did not arise from a class present in the training data. We establish consistent learning strategies for both of these domain adaptation problems, which to our knowledge are the first of their kind. We also implement the class proportion estimation technique and demonstrate its performance on several benchmark data sets.
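To make the class proportion estimation problem concrete, the following is a minimal sketch of a standard two-class baseline often called "adjusted count". It is not the method proposed in this paper: unlike the authors' approach, it assumes labeled data from every class, used to measure a classifier's true and false positive rates. The function name `adjusted_count` and its parameters are hypothetical, chosen for illustration only.

```python
def adjusted_count(test_predictions, tpr, fpr):
    """Estimate the positive-class proportion in an unlabeled test set.

    Illustrative baseline only (NOT the paper's estimator).
    test_predictions: iterable of 0/1 labels output by a fixed classifier.
    tpr, fpr: the classifier's true/false positive rates, assumed to have
              been measured on labeled data from both classes.
    """
    predictions = list(test_predictions)
    if not predictions:
        raise ValueError("test_predictions must be non-empty")
    if tpr == fpr:
        raise ValueError("tpr and fpr must differ for the estimate to exist")

    # Fraction of test points the classifier labels positive.
    observed = sum(1 for p in predictions if p == 1) / len(predictions)

    # If pi is the true positive proportion, then in expectation
    #   observed = pi * tpr + (1 - pi) * fpr,
    # which is solved for pi here.
    estimate = (observed - fpr) / (tpr - fpr)

    # Clip to [0, 1], since sampling noise can push the raw value outside.
    return min(1.0, max(0.0, estimate))


# Example: 41 of 100 test points are predicted positive by a classifier
# with tpr = 0.9 and fpr = 0.2.
example_predictions = [1] * 41 + [0] * 59
example_estimate = adjusted_count(example_predictions, tpr=0.9, fpr=0.2)
```

In this example the observed positive rate of 0.41 corresponds to an estimated positive proportion of about 0.3, because 0.3 × 0.9 + 0.7 × 0.2 = 0.41. The setting studied in the paper is harder: labeled examples from one of the classes are unavailable, so its rates cannot be measured directly in this way.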
arXiv:1306.5056v3 fatcat:hou5hm7sljhgvbkjbbftiw4mb4