Semiparametric Mixed-Scale Models Using Shared Bayesian Forests [article]

Antonio R. Linero, Debajyoti Sinha, Stuart R. Lipsitz
2019 arXiv pre-print
This paper demonstrates the advantages of sharing information about unknown features of covariates across multiple model components in various nonparametric regression problems including multivariate, heteroscedastic, and semi-continuous responses. In this paper, we present methodology which allows for information to be shared nonparametrically across various model components using Bayesian sum-of-tree models. Our simulation results demonstrate that sharing of information across related model components is often very beneficial, particularly in sparse high-dimensional problems in which variable selection must be conducted. We illustrate our methodology by analyzing medical expenditure data from the Medical Expenditure Panel Survey (MEPS). To facilitate the Bayesian nonparametric regression analysis, we develop two novel models for analyzing the MEPS data using Bayesian additive regression trees - a heteroskedastic log-normal hurdle model with a "shrink-towards-homoskedasticity" prior, and a gamma hurdle model.
arXiv:1809.08521v4 fatcat:gykhcbkd2ndzxe76mekmsk4g4u