A Variance Reduction Framework for Stable Feature Selection

Yue Han, Lei Yu
2010 IEEE International Conference on Data Mining
Besides high accuracy, the stability of feature selection has recently attracted strong interest in knowledge discovery from high-dimensional data. In this study, we present a theoretical framework for the relationship between the stability and accuracy of feature selection, based on a formal bias-variance decomposition of feature selection error. The framework also suggests a variance reduction approach for improving the stability of feature selection algorithms. Furthermore, we propose an [...]al variance reduction framework, margin-based instance weighting, which weights training instances according to their influence on the estimation of feature relevance. We also develop an efficient algorithm under this framework. Experiments on synthetic data and real-world microarray data verify both the theoretical framework and the effectiveness of the proposed algorithm at variance reduction. The proposed algorithm is also shown to be effective at improving subset stability while maintaining comparable classification accuracy based on the selected features.
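As an illustration of what "subset stability" can mean in practice, the following minimal sketch scores how similar the feature subsets selected from different resampled training sets are. It assumes mean pairwise Jaccard similarity as the stability measure, which is a common choice but is not confirmed by this abstract as the measure used in the paper. The function names `jaccard` and `subset_stability` and the example subsets are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two feature subsets (collections of feature indices)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty subsets are treated as identical
    return len(a & b) / len(a | b)


def subset_stability(subsets):
    """Mean pairwise Jaccard similarity over all pairs of selected subsets.

    Assumption: this is one common stability measure, not necessarily
    the one used by the authors.
    """
    pairs = list(combinations(subsets, 2))
    if not pairs:
        raise ValueError("need at least two subsets to measure stability")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical subsets selected from three resampled training sets.
runs = [{0, 1, 2}, {0, 1, 3}, {0, 1, 2}]
print(subset_stability(runs))
```

A score near 1 means the selection algorithm picks nearly the same features regardless of which training sample it sees; a score near 0 means the selected subsets barely overlap.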
doi:10.1109/icdm.2010.144 dblp:conf/icdm/HanY10 fatcat:3zipero2f5dlxgjjpjvlzp5ojm