Penalized Clustering of Large-Scale Functional Data With Multiple Covariates

Ping Ma, Wenxuan Zhong
2008 Journal of the American Statistical Association  
In this article we propose a penalized clustering method for large-scale data with multiple covariates through a functional data approach. In our proposed method, responses and covariates are linked together through nonparametric multivariate functions (fixed effects), which have great flexibility in modeling various function features, such as jump points, branching, and periodicity. Functional ANOVA is used to further decompose multivariate functions in a reproducing kernel Hilbert space and provide associated notions of main effect and interaction. Parsimonious random effects are used to capture various correlation structures. The mixed-effects models are nested under a general mixture model in which the heterogeneity of functional data is characterized. We propose a penalized Henderson's likelihood approach for model fitting and design a rejection-controlled EM algorithm for the estimation. Our method selects smoothing parameters through generalized cross-validation. Furthermore, Bayesian confidence intervals are used to measure the clustering uncertainty. Simulation studies and real-data examples are presented to investigate the empirical performance of the proposed method. Open-source code is available in the R package MFDA.
doi:10.1198/016214508000000247 fatcat:kqybmdrkynhmrbpvyyr3ipdpga
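
The abstract names a rejection-controlled EM algorithm but does not describe its steps. The following is a minimal sketch of one plausible ingredient, assuming the generic rejection-control idea from Monte Carlo is applied to the posterior cluster-membership probabilities computed in the E-step: probabilities below a threshold are either dropped or raised to the threshold at random, so that most curves contribute to only a few clusters in the M-step. The function name `rejection_control`, the threshold value, and the toy probabilities are hypothetical, and this is not the interface of the MFDA package.

```python
import numpy as np

def rejection_control(post, threshold, rng):
    """Sparsify posterior membership probabilities (one row per curve).

    Assumed rule (not taken from the abstract): an entry p below the
    threshold c is raised to c with probability p / c and set to zero
    otherwise; entries at or above c are kept. Rows are then renormalized.
    """
    post = np.asarray(post, dtype=float)
    n_clusters = post.shape[1]
    # With 0 < c <= 1/K the largest entry of each row is always kept,
    # so no row can collapse to all zeros.
    if not 0.0 < threshold <= 1.0 / n_clusters:
        raise ValueError("threshold must lie in (0, 1/K]")
    small = post < threshold
    # Accept a small entry with probability p / c.
    accepted = rng.random(post.shape) < post / threshold
    adjusted = np.where(small, np.where(accepted, threshold, 0.0), post)
    # Renormalize so each row is again a probability vector.
    return adjusted / adjusted.sum(axis=1, keepdims=True)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy E-step output for two curves and three clusters (made-up numbers).
    toy = np.array([[0.97, 0.02, 0.01],
                    [0.50, 0.49, 0.01]])
    print(rejection_control(toy, threshold=0.05, rng=rng))
```

Before renormalization each adjusted entry has expectation equal to the original probability, which is the usual motivation for this kind of thresholding; whether the authors' algorithm uses exactly this rule is an assumption here.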