Subset selection by Mallows' Cp: A mixed integer programming approach

Ryuhei Miyashiro, Yuichi Takano
2015 Expert Systems with Applications
This paper concerns a method of selecting the best subset of explanatory variables for a linear regression model. Employing Mallows' Cp as a goodness-of-fit measure, we formulate the subset selection problem as a mixed integer quadratic programming problem. Computational results demonstrate that our method provides the best subset of variables in a few seconds when the number of candidate explanatory variables is less than 30. Furthermore, when handling datasets consisting of a large number of samples, it finds better-quality solutions faster than stepwise regression methods do.
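
To make the selection criterion concrete, here is a minimal Python sketch of choosing a subset by Mallows' Cp. It is not the paper's mixed integer quadratic programming formulation: it swaps in plain exhaustive enumeration, which is practical only for a handful of candidate variables. It assumes the common textbook form Cp = SSE_S / sigma^2 - n + 2(|S| + 1), with sigma^2 estimated from the full model including an intercept. The function names `sse` and `best_subset_by_cp` and the synthetic data are illustrative assumptions, not taken from the paper.

```python
import itertools

import numpy as np


def sse(X, y, cols):
    """Residual sum of squares of an OLS fit (with intercept) on the given columns."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)


def best_subset_by_cp(X, y):
    """Enumerate every subset of columns and return (min Cp, subset).

    Assumed form: Cp = SSE_S / sigma2 - n + 2 * (|S| + 1),
    where sigma2 is the residual variance of the full model.
    """
    n, k = X.shape
    sigma2 = sse(X, y, range(k)) / (n - k - 1)
    best_cp, best_cols = np.inf, ()
    for size in range(k + 1):
        for cols in itertools.combinations(range(k), size):
            cp = sse(X, y, cols) / sigma2 - n + 2 * (size + 1)
            if cp < best_cp:
                best_cp, best_cols = cp, cols
    return best_cp, best_cols


# Synthetic data (hypothetical): only columns 0 and 3 influence y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(scale=0.5, size=200)

cp, subset = best_subset_by_cp(X, y)
print(subset, round(cp, 3))
```

Enumeration visits 2^k subsets, so its cost doubles with each added candidate variable; that growth is the reason an optimization formulation such as the one described in the abstract is attractive for larger k.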
doi:10.1016/j.eswa.2014.07.056 fatcat:buyje3xkzvfonhizjwxsxw26fa