Large PSD Matrix Estimation from Partial Elements
Fast Estimation of Large Positive Semi-Definite Similarity Measure Matrices Using Partial Elements

Hiroshi KUWAJIMA, Takashi WASHIO
JSAI Technical Report, Type 2 SIG  
With the development of ubiquitous sensing, electronic documents, and multimedia technologies, data sets consisting of massive numbers of high-dimensional instances have become available in various practical fields. Efficient evaluation of similarity measures among such instances, e.g., correlations and kernels, is one of the most important tasks required by major data mining techniques for instance queries and clustering. However, the computational complexity of direct computation for n objects is O(n^2), which is practically intractable for high-dimensional and/or massive data and for complex similarity measures. Moreover, some scientific similarity measurements among objects are time-consuming and costly, as in gene expression experiments. The objective of this paper is to provide an efficient remedy to this problem. We propose a fast approach that estimates the similarity measures among n instances from a partial set of actually computed and/or observed similarity measures, together with a mathematical constraint governing similarity measures called "Positive Semi-Definiteness (PSD)". The superior performance of our approach in both efficiency and accuracy of estimation is demonstrated through evaluation on artificial and real-world data sets.
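
As a small illustration of why the PSD constraint helps, consider three instances with unit self-similarity where only two of the three pairwise similarities have been computed. The sketch below is not the estimation procedure of the paper, which the abstract does not describe; it only shows, for this 3x3 case, how PSD restricts the unobserved element to an interval. The function names `psd_bounds` and `det3` and the sample values 0.9 and 0.8 are hypothetical and chosen for illustration.

```python
import math

def psd_bounds(s12, s13):
    """Interval of s23 values that keep the 3x3 unit-diagonal
    similarity matrix [[1, s12, s13], [s12, 1, s23], [s13, s23, 1]]
    positive semi-definite, given observed s12 and s13 in [-1, 1]."""
    center = s12 * s13
    radius = math.sqrt((1.0 - s12 ** 2) * (1.0 - s13 ** 2))
    return center - radius, center + radius

def det3(s12, s13, s23):
    """Determinant of the 3x3 unit-diagonal similarity matrix."""
    return 1.0 + 2.0 * s12 * s13 * s23 - s12 ** 2 - s13 ** 2 - s23 ** 2

# Two similarities are observed; the third is never computed directly.
s12, s13 = 0.9, 0.8
lo, hi = psd_bounds(s12, s13)

# The midpoint s12 * s13 maximizes the determinant over the feasible
# interval; it is one simple point estimate, used here as an assumption.
estimate = 0.5 * (lo + hi)

print(f"s23 must lie in [{lo:.4f}, {hi:.4f}]; midpoint estimate {estimate:.4f}")
```

The more similar the observed pairs are, the narrower the feasible interval becomes, which is the intuition behind recovering unobserved elements from a partial set of computed ones.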
doi:10.11517/jsaisigtwo.2007.dmsm-a701_06 fatcat:wvy6wi3wz5fuzbje4pyqifd3dm