A Bayesian approach to analyse overdispersed longitudinal count data

Fernanda B. Rizzato, Roseli A. Leandro, Clarice G.B. Demétrio, Geert Molenberghs
2015 Journal of Applied Statistics  
Generalized linear models (GLMs) [20] have unified the regression methodology for a wide variety of discrete and continuous responses that can be assumed to be independent. Consider independent random variables $Y_1, \ldots, Y_n$ with vector of means $\mu = (\mu_1, \ldots, \mu_n)$. The three components of a GLM are: (i) the response variable $Y_i$, stemming from the exponential family of distributions, with canonical parameter $\theta_i$ and dispersion parameter $\phi > 0$; (ii) a linear predictor vector $\eta = X\beta$, where $\beta$ is a vector of $p$ unknown parameters and $X = (x_1, \ldots, x_n)^T$ is an $n \times p$ design matrix; (iii) a link function $g(\cdot)$ relating the mean to the linear predictor, i.e., $\eta_i = g(\mu_i)$ and $\mu_i = g^{-1}(x_i^T \beta)$.

Our focus in this paper is on the particular GLM with Poisson distribution and log link function. Assuming that the random variables $Y_i$ represent counts with means $\lambda_i$, the standard Poisson model assumes that $Y_i \sim \text{Poisson}(\lambda_i)$ with probability distribution $f_{Y_i}(y_i) = \frac{e^{-\lambda_i} \lambda_i^{y_i}}{y_i!}$, $y_i = 0, 1, 2, \ldots$, where $\lambda_i = \exp(x_i^T \beta)$. Parameter estimation conventionally proceeds by maximum likelihood using the iteratively reweighted least squares algorithm.

The Poisson model is based on two strong assumptions: (i) independence among measurements and (ii) the same mean $\lambda$ for all individuals treated equally. The failure of one or both of these assumptions will cause the observed variance to differ from the mean, which is the theoretical variance of the Poisson distribution; a variance larger than the mean is the aforementioned overdispersion. Hinde and Demétrio [8] provide a general treatment of overdispersion for univariate data. Underdispersion is possible as well in some settings (e.g., when there is competition for scarce resources among cluster members). Considering a Poisson model, if the residual deviance is much larger than its associated degrees of freedom, there is evidence of lack of fit of the model and probably of overdispersion.

An initial exploratory analysis of the epilepsy data suggested two options. The first option is, for each individual, to consider a Poisson model with a linear predictor $\eta = \log(\mu) = \beta_0 + \beta_1 t$ fitted to the counts, resulting in 45 analyses for
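The Poisson log-linear fit and the deviance-based check for overdispersion described above can be illustrated with a minimal sketch. It assumes NumPy only; the helper names `fit_poisson_irls` and `poisson_deviance` are hypothetical, the starting values are one common choice rather than the authors' own, and the data are simulated, not the epilepsy data.

```python
import numpy as np


def fit_poisson_irls(X, y, max_iter=100, tol=1e-10):
    """Maximum-likelihood fit of a Poisson log-link GLM by IRLS (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    eta = np.log(y + 0.5)            # starting linear predictor; +0.5 avoids log(0)
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        mu = np.exp(eta)             # inverse of the log link
        z = eta + (y - mu) / mu      # working response
        XtW = X.T * mu               # working weights equal mu for the log link
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        eta = X @ beta
        if converged:
            break
    return beta


def poisson_deviance(y, mu):
    """Residual deviance 2 * sum(y * log(y / mu) - (y - mu)) of a Poisson fit."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    ratio = np.where(y > 0, y / mu, 1.0)   # y * log(y / mu) is taken as 0 when y == 0
    return 2.0 * np.sum(y * np.log(ratio) - (y - mu))


# Simulated counts with eta = beta_0 + beta_1 * t (illustrative values only).
rng = np.random.default_rng(1)
t = np.linspace(0.0, 4.0, 200)
X = np.column_stack([np.ones_like(t), t])
y = rng.poisson(np.exp(0.5 + 0.3 * t))

beta_hat = fit_poisson_irls(X, y)
mu_hat = np.exp(X @ beta_hat)
df = len(y) - X.shape[1]                   # residual degrees of freedom
print("beta_hat:", beta_hat)
print("residual deviance / df:", poisson_deviance(y, mu_hat) / df)
```

A ratio of residual deviance to degrees of freedom close to 1 is compatible with the Poisson assumption, whereas a ratio well above 1 points to lack of fit and possibly overdispersion.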
doi:10.1080/02664763.2015.1126812 fatcat:3atlcyco4rbtbeqkhoz5m3lbx4