Entropy-Based Experimental Design for Optimal Model Discrimination in the Geosciences
Choosing between competing models lies at the heart of scientific work, and is a frequent motivation for experimentation. Optimal experimental design (OD) methods maximize the benefit of experiments towards a specified goal. We advance and demonstrate an OD approach to maximize the information gained towards model selection. We make use of so-called model choice indicators, which are random variables with an expected value equal to Bayesian model weights. Their uncertainty can be measured with
Shannon entropy. Since the experimental data are still random variables in the planning phase of an experiment, we use mutual information (the expected reduction in Shannon entropy) to quantify the information gained from a proposed experimental design. For implementation, we use the Preposterior Data Impact Assessor framework (PreDIA), because it is free of the lower-order approximations of mutual information often found in the geosciences. In comparison to other studies in statistics, our framework is not restricted to sequential design or to discrete-valued data, and it can handle measurement errors. As an application example, we optimize an experiment on the transport of contaminants in clay, featuring the problem of choosing between competing isotherms to describe sorption. We compare the results of optimizing towards maximum model discrimination with an alternative OD approach that minimizes the overall predictive uncertainty under model choice uncertainty.

Entropy 2016, 18, 409

Competing models may yield contradicting predictions or imply different scientific conclusions. Specific examples in the area of geosciences include competing models for non-Fickian transport of dissolved chemicals in moving fluids, different laws for the sorption of dissolved chemicals onto solids [6,7], different versions of Darcy's law for flow in variably-saturated porous media, different equations that relate viscosity to other thermodynamic properties of fluids, and so forth. One possible reaction to this situation is to see models merely as working hypotheses. This implies that several competing models should be suggested, and each tested against the available data. Such tests are often complex due to several involved uncertainties, which typically originate from data scarcity, scale disparity, and other sources of model uncertainty.
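The entropy-based design criterion described in the abstract can be sketched numerically. The following is a minimal illustration, not the PreDIA implementation: it assumes two hypothetical models, two discrete data outcomes, and made-up likelihood values, and computes the mutual information between model choice and data as the prior entropy of the model weights minus the expected posterior entropy.

```python
import math

def shannon_entropy(p):
    """Shannon entropy (in bits) of a discrete distribution p."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0.0)

# Hypothetical prior model weights for two competing models
prior = [0.5, 0.5]

# Hypothetical likelihoods p(d | M_k) of two possible data outcomes
# under each model (rows: models M1, M2; columns: outcomes d1, d2)
lik = [[0.8, 0.2],
       [0.3, 0.7]]

# Marginal probability of each outcome, p(d) = sum_k p(d | M_k) P(M_k)
p_d = [sum(lik[k][j] * prior[k] for k in range(2)) for j in range(2)]

# Posterior model weights for each outcome, via Bayes' rule
post = [[lik[k][j] * prior[k] / p_d[j] for k in range(2)] for j in range(2)]

# Mutual information = prior entropy minus expected posterior entropy;
# a design with larger MI is expected to discriminate the models better
mi = shannon_entropy(prior) - sum(p_d[j] * shannon_entropy(post[j])
                                  for j in range(2))
print(round(mi, 3))  # → 0.191
```

In an actual design problem, the discrete outcomes would be replaced by continuous (and noisy) data, which is why the paper resorts to the PreDIA framework rather than a direct sum over outcomes.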
The Bayesian version of testing several hypothesized models against a common set of data is called Bayesian model selection (BMS, Raftery). BMS is based on posterior model probabilities that reflect a compromise between the performance of a model and its degree of (over-)complexity. Due to its rigorous statistical foundation, BMS has become increasingly popular in the geosciences.

A second possible reaction is to accept the entire set of models as plausible alternatives. Then, one lets each model predict a statistical distribution, and combines their individual predictive probability distributions into an overall distribution. The combined distribution covers both parametric uncertainties and the uncertainty of model choice. The most common framework that represents this approach is Bayesian model averaging (BMA, Hoeting et al.). BMA has frequently been applied in the field of geosciences, because it allows for the explicit quantification of the uncertainty due to model choice.

Both approaches (model testing or model averaging) require data in order to infer the parameters of each model, and in order to evaluate the likelihood of the models in the light of the data. This is referred to as two levels of inference by MacKay. Data, however, can be expensive to acquire, especially when gained from sophisticated experiments. As an example, one may think of experiments that take a long time because slow processes must be observed (e.g., diffusion over large distances), that require deep drilling into rock formations, or that consume expensive materials.
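To make the two approaches concrete, the following sketch (hypothetical numbers throughout, not the paper's isotherm models) shows BMS posterior model weights computed from assumed marginal likelihoods, and a BMA predictive density that mixes each model's (assumed Gaussian) predictive distribution by its weight.

```python
import math

def normal_pdf(x, mu, sigma):
    """Gaussian density, used here as a stand-in predictive distribution."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# Hypothetical marginal likelihoods p(D | M_k) of two competing models
# (in practice obtained by integrating over each model's parameters)
marginal_lik = [0.012, 0.004]
prior = [0.5, 0.5]

# BMS: posterior model weights by Bayes' rule
evidence = sum(l * p for l, p in zip(marginal_lik, prior))
weights = [l * p / evidence for l, p in zip(marginal_lik, prior)]
print([round(w, 2) for w in weights])  # → [0.75, 0.25]

# BMA: combined predictive density, mixing each model's (assumed
# Gaussian) prediction by its posterior weight; hypothetical (mean, std)
preds = [(2.0, 0.5), (3.0, 0.8)]

def bma_density(x):
    return sum(w * normal_pdf(x, mu, s) for w, (mu, s) in zip(weights, preds))
```

The mixture `bma_density` is the "overall distribution" mentioned above: it spreads probability over both models' predictions, so its width reflects parametric uncertainty and model choice uncertainty at once.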