Extended BIC for small-n-large-P sparse GLM

Jiahua Chen, Zehua Chen
2012 Statistica Sinica
The small-n-large-P situation has become common in genetics research, medical studies, risk management, and other fields. Feature selection is crucial in these studies yet poses a serious challenge. Traditional criteria such as AIC, BIC, and cross-validation choose too many features. To overcome the difficulties caused by the small-n-large-P situation, Chen and Chen (2008) developed a family of extended Bayes information criteria (EBIC). Under normal linear models, EBIC has been found to be consistent and to have good finite-sample properties. Proving consistency for non-normal and nonlinear models poses serious technical difficulties. In this paper, through a number of novel techniques, we establish the consistency of EBIC under generalized linear models in the small-n-large-P situation. We also report simulation results and a real-data analysis to illustrate the effectiveness of EBIC for feature selection.
doi:10.5705/ss.2010.216 fatcat:lszurrcxdjgr5jheewgzsabbxi
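
Note on the criterion: the abstract does not write EBIC out. As commonly stated following Chen and Chen (2008), for a candidate model with k selected features out of P, sample size n, and maximized log-likelihood log L, the criterion is EBIC_gamma = -2 log L + k log n + 2 gamma log C(P, k), with 0 <= gamma <= 1, where gamma = 0 recovers the ordinary BIC. The Python sketch below assumes that form. The function names and the default gamma = 0.5 are illustrative choices, not taken from the paper, and the log-likelihood is assumed to come from whatever GLM fitting routine is in use.

```python
import math


def log_binomial(P, k):
    """Log of the binomial coefficient C(P, k), computed via log-gamma
    so that very large P does not overflow."""
    return math.lgamma(P + 1) - math.lgamma(k + 1) - math.lgamma(P - k + 1)


def ebic(log_likelihood, n, P, k, gamma=0.5):
    """EBIC for a model with k of P candidate features (assumed form).

    log_likelihood: maximized log-likelihood of the fitted model
    n: number of observations
    P: total number of candidate features
    k: number of features in the candidate model
    gamma: extra-penalty weight in [0, 1]; gamma = 0 gives ordinary BIC
    """
    bic = -2.0 * log_likelihood + k * math.log(n)
    return bic + 2.0 * gamma * log_binomial(P, k)


# Hypothetical numbers: a 3-feature model among P = 1000 candidates, n = 100.
bic_value = ebic(-10.0, n=100, P=1000, k=3, gamma=0.0)
ebic_value = ebic(-10.0, n=100, P=1000, k=3, gamma=1.0)
```

Among candidate models, the one with the smallest value is selected. The extra term grows with the number of size-k models that could be formed from P features, which is what penalizes larger models more heavily than BIC does when P is large relative to n.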