A multi-step approach to time series analysis and gene expression clustering

R. Amato, A. Ciaramella, N. Deniskina, C. D. Mondo, D. di Bernardo, C. Donalek, G. Longo, G. Mangano, G. Miele, G. Raiconi, A. Staiano, R. Tagliaferri
2006 Bioinformatics  
Motivation: The huge growth in gene expression data calls for the implementation of automatic tools for data processing and interpretation. Results: We present a new and comprehensive machine learning data mining framework consisting of a non-linear PCA neural network for feature extraction, and probabilistic principal surfaces combined with an agglomerative approach based on Negentropy, aimed at clustering gene microarray data. The method, which provides a user-friendly visualization interface, can work on noisy data with missing points and represents an automatic procedure to get, with no a priori assumptions, the number of clusters present in the data. Tests on a cell-cycle dataset, together with a detailed analysis, confirm the biological nature of the most significant clusters. Availability: The software described here is a subpackage of the ASTRONEURAL package and is available upon request from the corresponding author.
doi:10.1093/bioinformatics/btk026 pmid:16397005 fatcat:vtlz2pqozfa7jacuqyr5uaggm4