Commentary on Gronau and Wagenmakers
Computational Brain & Behavior
The three examples Gronau and Wagenmakers (Computational Brain and Behavior, 2018; hereafter denoted G&W) use to demonstrate the limitations of Bayesian forms of leave-one-out cross validation (let us term this LOOCV) for model selection have several important properties: The true model instance is among the model classes being compared; the smaller, simpler model is a point hypothesis that in fact generates the data; the larger class contains the smaller. As G&W admit, there is a good deal of
prior history pointing to the limitations of cross validation and LOOCV when used in such situations (e.g., Bernardo and Smith 1994). We do not wish to rehash this literature trail, but rather to give a conceptual overview of methodology that allows discussion of how various methods of model selection align with scientific practice and scientific inference, and to give our recommendation for the simplest approach that matches statistical inference to the needs of science. The methods include minimum description length (MDL) as reported by Grünwald (2007); Bayesian model selection (BMS) as reported by Kass and Raftery (Journal of the American Statistical Association, 90, 773-795, 1995); and LOOCV as reported by Browne (Journal of Mathematical Psychology, 44, 108-132, 2000) and Gelman et al. (Statistics and Computing, 24, 997-1016, 2014). In this commentary, we shall restrict the focus to forms of BMS and LOOCV. In addition, in these days of "Big Data," one wants inference procedures that will give reasonable answers as the amount of data grows large, which is one focus of the article by G&W. We discuss how the various inference procedures fare when the data grow large.
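The large-data behavior at issue can be illustrated with a minimal sketch under assumptions that are not taken from the text above: a Bernoulli point hypothesis M0 with theta0 = 0.5, nested in a larger class M1 with a uniform Beta(1, 1) prior on theta, and perfectly balanced data (k = n/2 successes). The function names `log_bf01` and `loo_diff01` are hypothetical, chosen only for this illustration. Under these assumptions the log Bayes factor favoring M0 keeps growing with n, whereas the difference in summed leave-one-out log predictive densities stays below 1.

```python
import math


def log_beta(a, b):
    """Log of the Beta function, via log-gamma for numerical stability."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_bf01(n, k, theta0=0.5):
    """Log Bayes factor for M0 (theta = theta0) over M1 (theta ~ Beta(1, 1)).

    Both marginal likelihoods are for one specific sequence with k successes
    in n trials, so the binomial coefficient cancels.
    """
    log_m0 = k * math.log(theta0) + (n - k) * math.log(1.0 - theta0)
    log_m1 = log_beta(k + 1, n - k + 1)  # integral of theta^k (1-theta)^(n-k)
    return log_m0 - log_m1


def loo_diff01(n, k, theta0=0.5):
    """Difference (M0 minus M1) in summed leave-one-out log predictive density.

    Under M1 with a Beta(1, 1) prior, the posterior after removing one
    observation gives predictive probability k / (n + 1) for a held-out
    success and (n - k) / (n + 1) for a held-out failure.
    """
    loo_m0 = k * math.log(theta0) + (n - k) * math.log(1.0 - theta0)
    loo_m1 = 0.0
    if k > 0:
        loo_m1 += k * math.log(k / (n + 1))
    if n - k > 0:
        loo_m1 += (n - k) * math.log((n - k) / (n + 1))
    return loo_m0 - loo_m1


if __name__ == "__main__":
    # Perfectly balanced data: an assumption made for illustration only.
    for n in (10, 100, 1000, 10000):
        k = n // 2
        print(n, round(log_bf01(n, k), 3), round(loo_diff01(n, k), 3))
```

In this balanced-data setting the leave-one-out difference reduces to n log(1 + 1/n), which approaches 1 from below, so the evidence that LOOCV assigns to the true point hypothesis does not accumulate as n grows; the log Bayes factor grows roughly like (1/2) log n.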