Dynamic mixtures of factor analyzers to characterize multivariate air pollutant exposures

Antonello Maruotti, Jan Bulla, Francesco Lagona, Marco Picone, Francesca Martella
2017 Annals of Applied Statistics  
The assessment of pollution exposure is based on the analysis of multivariate time series that include the concentrations of several pollutants as well as the measurements of multiple atmospheric variables. It typically requires methods of dimensionality reduction that are capable of identifying potentially dangerous combinations of pollutants and, simultaneously, of segmenting exposure periods according to air quality conditions. When the data are high-dimensional, however, efficient methods of dimensionality reduction are challenging because of the formidable structure of cross-correlations that arise from the dynamic interaction between weather conditions and natural/anthropogenic pollution sources. In order to assess pollution exposure in an urban area while taking the above-mentioned difficulties into account, we develop a class of parsimonious hidden Markov models. In a multivariate time-series setting, this approach allows temporal segmentation and dimensionality reduction to be performed simultaneously. We specifically approximate the distribution of multiple pollutant concentrations by mixtures of factor analysis models, whose parameters evolve according to a latent Markov chain. Covariates are included as predictors of the chain transition probabilities. Parameter constraints on the factorial component of the model are exploited to tune the flexibility of dimensionality reduction. In order to estimate the model parameters efficiently, we propose a novel three-step Alternating Expected Conditional Maximization (AECM) algorithm, which is also assessed in a simulation study. In the case study, the proposed methods were able (1) to describe the exposure to pollution in terms of a few latent regimes, (2) to associate these regimes with specific combinations of pollutant concentration levels as well as distinct correlation structures between concentrations, and (3) to capture the influence of weather conditions on transitions between regimes.
MSC 2010 subject classifications: Primary 62-07, 62H25; secondary 62P12
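The model structure described in the abstract (factor-analysis emissions whose parameters switch with a latent Markov chain, and covariates that drive the transition probabilities) can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' implementation: the multinomial-logit link for the transitions, the single scalar covariate, the fixed initial state, and all names (`transition_probs`, `emit`, `simulate`, `params`) and parameter values are hypothetical. It generates data from such a model and does not implement the AECM estimation.

```python
import math
import random


def transition_probs(prev_state, x, gamma0, gamma1):
    """Row of transition probabilities P(S_t = k | S_{t-1} = prev_state, x_t = x).

    Assumption: a multinomial-logit link with intercepts gamma0[j][k] and
    slopes gamma1[j][k] on one scalar covariate x.
    """
    n_states = len(gamma0)
    scores = [gamma0[prev_state][k] + gamma1[prev_state][k] * x
              for k in range(n_states)]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def emit(mu, loadings, psi, rng):
    """Draw y = mu + loadings @ u + e, with u ~ N(0, I_q), e ~ N(0, diag(psi))."""
    q = len(loadings[0])
    u = [rng.gauss(0.0, 1.0) for _ in range(q)]
    return [mu[i]
            + sum(loadings[i][j] * u[j] for j in range(q))
            + rng.gauss(0.0, math.sqrt(psi[i]))
            for i in range(len(mu))]


def simulate(n_steps, x, params, seed=0):
    """Simulate latent states and observations; the chain starts in state 0."""
    rng = random.Random(seed)
    state = 0  # simplification: fixed initial state
    states, ys = [], []
    for t in range(n_steps):
        if t > 0:
            p = transition_probs(state, x[t], params["gamma0"], params["gamma1"])
            state = rng.choices(range(len(p)), weights=p)[0]
        states.append(state)
        ys.append(emit(params["mu"][state], params["loadings"][state],
                       params["psi"][state], rng))
    return states, ys


# Hypothetical values: 2 regimes, 3 observed pollutants, 1 latent factor.
params = {
    "mu": [[1.0, 2.0, 0.5], [4.0, 5.0, 3.0]],
    "loadings": [[[0.8], [0.6], [0.1]], [[0.2], [0.9], [0.7]]],
    "psi": [[0.3, 0.3, 0.3], [0.5, 0.5, 0.5]],
    "gamma0": [[0.0, -2.0], [-2.0, 0.0]],
    "gamma1": [[0.0, 1.0], [0.0, 1.0]],
}
x = [0.1 * t for t in range(50)]  # made-up covariate series (e.g. a weather index)
states, ys = simulate(50, x, params)
```

Within regime k, the implied covariance of the observations is the loading matrix times its transpose plus the diagonal matrix of unique variances, so a small number of factors gives a parsimonious description of the cross-correlations between pollutants.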
doi:10.1214/17-aoas1049 fatcat:nivhimif6bfjjdbhbcy36jd5su