Variable selection in model-based clustering: A general variable role modeling

C. Maugis, G. Celeux, M.-L. Martin-Magniette
2009 Computational Statistics & Data Analysis  
The currently available variable selection procedures in model-based clustering assume that the irrelevant clustering variables are all independent or are all linked with the relevant clustering variables. We propose a more versatile variable selection model which describes three possible roles for each variable: the relevant clustering variables, the irrelevant clustering variables dependent on a part of the relevant clustering variables, and the irrelevant clustering variables totally independent of all the relevant variables. A model selection criterion and a variable selection algorithm are derived for this new variable role modeling. The model identifiability and the consistency of the variable selection criterion are also established. Numerical experiments highlight the interest of this new modeling.
doi:10.1016/j.csda.2009.04.013 fatcat:nvxurn4mqrberdbpmvtd3mqshm