A nonparametric Bayesian analysis of heterogeneous treatment effects in digital experimentation [article]

Matt Taddy, Matt Gardner, Liyun Chen, David Draper
2015 arXiv pre-print
Randomized controlled trials play an important role in how Internet companies predict the impact of policy decisions and product changes. In these 'digital experiments', different units (people, devices, products) respond differently to the treatment. This article presents a fast and scalable Bayesian nonparametric analysis of such heterogeneous treatment effects and their measurement in relation to observable covariates. New results and algorithms are provided for quantifying the uncertainty associated with treatment effect measurement via both linear projections and nonlinear regression trees (CART and Random Forests). For linear projections, our inference strategy leads to results that are mostly in agreement with those from the frequentist literature. We find that linear regression adjustment of treatment effect averages (i.e., post-stratification) can provide some variance reduction, but that this reduction will be vanishingly small in the low-signal and large-sample setting of digital experiments. For regression trees, we provide uncertainty quantification for the machine learning algorithms that are commonly applied in tree-fitting. We argue that practitioners should look to ensembles of trees (forests) rather than individual trees in their analysis. The ideas are applied on and illustrated through an example experiment involving 21 million unique users of eBay.com.
arXiv:1412.8563v4 fatcat:f63jqxjhr5h7znxka6iww77ck4