The value of Bayesian predictive projection for variable selection: an example of selecting lifestyle predictors of young adult well-being
A. Bartonicek, S. R. Wickham, N. Pat, T. S. Conner
2021
BMC Public Health
Background: Variable selection is an important issue in many fields such as public health and psychology. Researchers often gather data on many variables of interest and then are faced with two challenging goals: building an accurate model with few predictors, and making probabilistic statements (inference) about this model. Unfortunately, it is currently difficult to attain these goals with the two most popular variable selection methods: stepwise selection and LASSO. The aim of the present study was to demonstrate the use of predictive projection feature selection, a novel Bayesian variable selection method that delivers both predictive power and inference. We apply predictive projection to a sample of New Zealand young adults, use it to build a compact model for predicting well-being, and compare it to other variable selection methods.
Methods: The sample consisted of 791 young adults (ages 18 to 25, 71.7% female) living in Dunedin, New Zealand who had taken part in the Daily Life Study in 2013–2014. Participants completed a 13-day online daily diary assessment of their well-being and a range of lifestyle variables (e.g., sleep, physical activity, diet variables). The participants' diary data were averaged across days and analyzed cross-sectionally to identify predictors of average flourishing. Predictive projection was used to select as few predictors as necessary to approximate the predictive accuracy of a reference model with all 28 predictors. Predictive projection was also compared to other variable selection methods, including stepwise selection and LASSO.
Results: Three predictors were sufficient to approximate the predictions of the reference model: higher sleep quality, less trouble concentrating, and more servings of fruit. The performance of the projected submodel generalized well. Compared to other variable selection methods, predictive projection produced models with either matching or slightly worse performance; however, this performance was achieved with far fewer predictors.
Conclusion: Predictive projection was used to efficiently arrive at a compact model with good predictive accuracy. The predictors selected into the submodel (felt refreshed after waking up, had less trouble concentrating, and ate more servings of fruit) were all theoretically meaningful. Our findings showcase the utility of predictive projection in a practical variable selection problem.
doi:10.1186/s12889-021-10690-3
pmid:33836714
pmcid:PMC8033696
fatcat:shjm3izqrvfzjk3o4d75kis7ue