A note on posterior predictive checks to assess model fit for incomplete data

Dandan Xu, Arkendu Chatterjee, Michael Daniels
2016 Statistics in Medicine  
We examine two approaches, both based on the posterior predictive distribution, for assessing model fit with incomplete longitudinal data. The first assesses fit using replicated complete data, as advocated in Gelman et al. (2005). The second assesses fit using replicated observed data. We discuss the differences between the two approaches and present an analytic example for illustration. Both checks are applied to data from a longitudinal clinical trial. The proposed checks can easily be implemented in standard software such as (Win)BUGS, JAGS, or Stan.

Incomplete Data Model Fit

Notation and Review

To introduce posterior predictive checks for incomplete longitudinal data, we first need some notation and concepts. Let Y_i, i = 1, ..., n, denote the J-dimensional longitudinal response vector (with components Y_ij, j = 1, ..., J) for individual i, and let Y = (Y_1, ..., Y_n). Let R be the vector of observed data indicators, ordered as Y; i.e., R_ij = I{Y_ij is observed}, and let Y_obs = {Y_ij : r_ij = 1}. The full data are (y, r); the observed data are (y_obs, r). The extrapolation factorization of the full data model is ...
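The displayed equation is cut off in this excerpt. For orientation only, the extrapolation factorization is commonly written in the missing-data literature in the form below. This is a sketch in the notation above, not a recovery of the authors' display: the symbols y_mis = {y_ij : r_ij = 0}, omega, omega_E and omega_O do not appear in the excerpt and are assumptions, so the original equation may differ in its details.

\[
p(y, r \mid \omega) \;=\; p(y_{\mathrm{mis}} \mid y_{\mathrm{obs}}, r, \omega_E)\; p(y_{\mathrm{obs}}, r \mid \omega_O)
\]

In this form the second factor, the observed data model, is identified by the observed data (y_obs, r), while the first factor, the extrapolation distribution, cannot be identified from the data without further assumptions about the missingness mechanism.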
doi:10.1002/sim.7040 pmid:27426216 pmcid:PMC5096987 fatcat:ajxz6yb635fddccqnks7jxa2tq