Evaluation and selection of models for out-of-sample prediction when the sample size is small relative to the complexity of the data-generating process

Hannes Leeb
2008 Bernoulli  
In regression with random design, we study the problem of selecting a model that performs well for out-of-sample prediction. We do not assume that any of the candidate models under consideration are correct. Our analysis is based on explicit finite-sample results. Our main findings differ from those of other analyses that are based on traditional large-sample limit approximations because we consider a situation where the sample size is small relative to the complexity of the data-generating process, in the sense that the number of parameters in a 'good' model is of the same order as sample size. Also, we allow for the case where the number of candidate models is (much) larger than sample size.
doi:10.3150/08-bej127 fatcat:4odxmzkz6re2zooixbsun5voam