A Fast Algorithm for the Minimum Covariance Determinant Estimator

Peter J. Rousseeuw, Katrien Van Driessen
1999 Technometrics  
The minimum covariance determinant (MCD) method of Rousseeuw (1984) is a highly robust estimator of multivariate location and scatter. Its objective is to find h observations (out of n) whose covariance matrix has the lowest determinant. Until now, applications of the MCD were hampered by the computation time of existing algorithms, which were limited to a few hundred objects in a few dimensions. We discuss two important applications of larger size: one about a production process at Philips with n = 677 objects and p = 9 variables, and a data set from astronomy with n = 137,256 objects and p = 27 variables. To deal with such problems we have developed a new algorithm for the MCD, called FAST-MCD. The basic ideas are an inequality involving order statistics and determinants, and techniques which we call 'selective iteration' and 'nested extensions'. For small data sets FAST-MCD typically finds the exact MCD, whereas for larger data sets it gives more accurate results than existing algorithms and is faster by orders of magnitude. Moreover, FAST-MCD is able to detect an exact fit, i.e., a hyperplane containing h or more observations. The new algorithm makes the MCD method available as a routine tool for analyzing multivariate data. We also propose the distance-distance plot (or 'D-D plot'), which displays MCD-based robust distances versus Mahalanobis distances, and illustrate it with some examples.
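
The inequality mentioned in the abstract is usually presented as a concentration step: fit a mean and covariance to a subset of h points, rank all n points by Mahalanobis distance to that fit, and keep the h closest; the determinant of the new subset's covariance cannot be larger than the old one. The following is a minimal Python sketch of that single idea, not the authors' FAST-MCD implementation. The function names (`c_step`, `subset_logdet`, `concentrate`), the random starting subset, and the stopping rule are assumptions made for illustration; selective iteration, nested extensions, and exact-fit detection are omitted.

```python
import numpy as np


def c_step(X, subset, h):
    """One concentration step: refit on `subset`, keep the h closest points."""
    center = X[subset].mean(axis=0)
    cov = np.cov(X[subset], rowvar=False)
    diff = X - center
    # Squared Mahalanobis distances of all n points to the current fit.
    d2 = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    # Indices of the h smallest distances form the next subset.
    return np.argsort(d2)[:h]


def subset_logdet(X, subset):
    """Log-determinant of the sample covariance of the chosen rows."""
    _, logdet = np.linalg.slogdet(np.cov(X[subset], rowvar=False))
    return logdet


def concentrate(X, h, rng, max_iter=100):
    """Iterate C-steps from a random start (illustrative assumption)
    until the subset stops changing; return subset and log-det history."""
    n = X.shape[0]
    subset = rng.choice(n, size=h, replace=False)
    history = [subset_logdet(X, subset)]
    for _ in range(max_iter):
        new_subset = c_step(X, subset, h)
        history.append(subset_logdet(X, new_subset))
        if np.array_equal(np.sort(new_subset), np.sort(subset)):
            break
        subset = new_subset
    return subset, history


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    X[:40] += 10.0  # shift 40 rows to act as outliers
    subset, history = concentrate(X, h=150, rng=rng)
    print("log-determinants per step:", np.round(history, 3))
```

A single run from one random start only reaches a local minimum of the determinant, so a practical procedure would repeat this from many starting subsets and keep the best result.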
doi:10.1080/00401706.1999.10485670 fatcat:7gfxkaahbjdblawybd7m6sbppi