On Spectral Learning of Mixtures of Distributions [chapter]

Dimitris Achlioptas, Frank McSherry
2005 Lecture Notes in Computer Science  
We consider the problem of learning mixtures of distributions via spectral methods and derive a tight characterization of when such methods are useful. Specifically, given a mixture-sample, let µ_i, C_i, w_i denote the empirical mean, covariance matrix, and mixing weight of the i-th component. We prove that a very simple algorithm, namely spectral projection followed by single-linkage clustering, properly classifies every point in the sample when each µ_i is separated from every µ_j by ‖C_i‖_2^{1/2} (1/w_i + 1/w_j)^{1/2}, plus a term that depends on the concentration properties of the distributions in the mixture. This second term is very small for many distributions, including Gaussians, log-concave distributions, and many others. As a result, we get the best known bounds for learning mixtures of arbitrary Gaussians in terms of the required mean separation. On the other hand, we prove that given any k means µ_i and mixing weights w_i, there are (many) sets of matrices C_i such that each µ_i is separated from every µ_j by ‖C_i‖_2^{1/2} (1/w_i + 1/w_j)^{1/2}, yet applying spectral projection to the corresponding Gaussian mixture causes it to collapse completely, i.e., all means and covariance matrices in the projected mixture become identical.
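The two-step algorithm in the abstract, spectral projection followed by single-linkage clustering, can be sketched roughly as follows. This is a minimal illustration, not the paper's exact procedure: it assumes a well-separated mixture, projects the sample onto the top-k right singular vectors of the centered data matrix, and cuts the single-linkage dendrogram into k clusters; the function name and the choice of SciPy routines are mine.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def spectral_single_linkage(X, k):
    """Cluster an n x d mixture-sample X into k groups (illustrative sketch).

    Step 1 (spectral projection): project the centered sample onto the span
    of its top-k right singular vectors, reducing dimension to k.
    Step 2 (single-linkage): build a single-linkage dendrogram in the
    projected space and cut it into k clusters.
    """
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Xc @ Vt[:k].T                      # n x k projected sample
    Z = linkage(P, method="single")        # single-linkage merge tree
    return fcluster(Z, t=k, criterion="maxclust")  # labels in 1..k
```

Under the separation condition in the abstract, points from the same component end up much closer to each other in the projection than to points from other components, so the k-1 largest merges in the single-linkage tree occur between components, and cutting the tree at k clusters recovers the true partition.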
doi:10.1007/11503415_31 fatcat:73dbu4fv4jdqzdhx2gwhejlvrq