Data spectroscopy

Tao Shi, Mikhail Belkin, Bin Yu
2008 Proceedings of the 25th international conference on Machine learning - ICML '08  
In this paper we develop a spectral framework for estimating mixture distributions, specifically Gaussian mixture models. In physics, spectroscopy is often used for the identification of substances through their spectrum. Treating a kernel function K(x, y) as "light" and the sampled data as "substance", the spectrum of their interaction (eigenvalues and eigenvectors of the kernel matrix K) unveils certain aspects of the underlying parametric distribution p, such as the parameters of a Gaussian mixture. Our approach extends the intuitions and analyses underlying the existing spectral techniques, such as spectral clustering and Kernel Principal Components Analysis (KPCA). We construct algorithms to estimate parameters of Gaussian mixture models, including the number of mixture components, their means and covariance matrices, which are important in many practical applications. We provide a theoretical framework and show encouraging experimental results.
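
As a rough illustration of the "spectrum of the kernel matrix" mentioned in the abstract, the sketch below draws samples from a two-component Gaussian mixture, builds a Gaussian kernel matrix, and inspects its leading eigenvalues and eigenvectors. It is not the estimation algorithm from the paper. The function names, the bandwidth of 1.0, the division of K by the sample size, and the mixture parameters (means -6 and +6, unit variance, 200 and 100 samples) are all assumptions chosen for this example.

```python
import numpy as np


def gaussian_kernel_matrix(x, bandwidth):
    """Return K with K[i, j] = exp(-||x_i - x_j||^2 / (2 * bandwidth^2))."""
    sq_norms = np.sum(x ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def kernel_spectrum(x, bandwidth):
    """Eigenvalues (descending) and matching eigenvectors of K / n."""
    n = x.shape[0]
    k = gaussian_kernel_matrix(x, bandwidth) / n  # normalization is an assumption
    eigvals, eigvecs = np.linalg.eigh(k)  # eigh returns ascending eigenvalues
    return eigvals[::-1], eigvecs[:, ::-1]


# Hypothetical 1-D mixture: 200 points near -6 and 100 points near +6.
rng = np.random.default_rng(0)
n_first, n_second = 200, 100
x = np.concatenate([
    rng.normal(-6.0, 1.0, size=(n_first, 1)),
    rng.normal(6.0, 1.0, size=(n_second, 1)),
])

eigvals, eigvecs = kernel_spectrum(x, bandwidth=1.0)

# Fraction of each leading eigenvector's squared norm carried by the first
# component's samples (eigenvectors from eigh have unit norm).
mass_on_first = np.sum(eigvecs[:n_first, :3] ** 2, axis=0)

print("leading eigenvalues:", eigvals[:3])
print("mass on first component:", mass_on_first)
```

With components this well separated, the kernel matrix is close to block diagonal, so the top eigenvector places almost all of its weight on the samples of the larger component. That kind of structure is what the abstract refers to when it says the spectrum reveals aspects of the underlying distribution.
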
doi:10.1145/1390156.1390274 dblp:conf/icml/ShiBY08 fatcat:omgids4au5ddbo53pbpcbwl4ge