Learning an un-Supervised – Clustering Algorithm Monte Carlo over Consensus Clustering for Genomic Data for Tumor Identification

2019 International Journal of Recent Technology and Engineering  
Clustering groups similar objects into sets known as clusters. Objects in one cluster are likely to differ from objects grouped under another cluster. Gene expression is the process by which information from a gene is used in the synthesis of a functional gene product. Subgroup classification is a basic task in high-throughput genomic data analysis, especially for gene expression and methylation data analysis. Mostly, unsupervised clustering methods are applied to predict new subgroups or test the consistency with known annotations. To get a stable classification of subgroups, consensus clustering is always performed. It clusters repeatedly on randomly sampled subsets of the data and summarizes the robustness of the clustering. When faced with significant uncertainty in making a forecast or estimation, Monte Carlo simulation might prove to be a better solution. Monte Carlo3C is a consensus clustering algorithm that uses a Monte Carlo simulation to eliminate overfitting and can reject the null hypothesis that the data contain only one cluster.
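The resampling idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the base clusterer (a small k-means), the robustness summary (the proportion of ambiguously clustered pairs, PAC), the single-Gaussian null model, and every function name below are assumptions chosen for illustration. The sketch clusters random subsets of the data, records how often each pair of samples lands in the same cluster, and then compares the observed stability with the stability of simulated data that contain only one cluster.

```python
import random

def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, rng, n_iter=10):
    """Lloyd's k-means with farthest-first initialisation; returns one label per point."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: dist2(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster becomes empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def consensus_matrix(points, k, n_resamples=30, frac=0.8, seed=0):
    """Fraction of resamples in which each pair of points shares a cluster."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    for _ in range(n_resamples):
        idx = rng.sample(range(n), int(frac * n))
        labels = kmeans([points[i] for i in idx], k, rng)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                if labels[a] == labels[b]:
                    together[i][j] += 1
    return [[together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
             for j in range(n)] for i in range(n)]

def pac(cm, lo=0.1, hi=0.9):
    """Proportion of pairs whose consensus value is ambiguous (strictly between lo and hi)."""
    n = len(cm)
    pairs = [cm[i][j] for i in range(n) for j in range(i + 1, n)]
    return sum(lo < v < hi for v in pairs) / len(pairs)

def monte_carlo_p_value(points, k, n_sim=20, seed=1):
    """Compare the observed PAC with PAC values from simulated one-cluster data.

    Assumption: the null data are drawn from one Gaussian matched to the
    per-feature mean and standard deviation of the real data.
    """
    rng = random.Random(seed)
    observed = pac(consensus_matrix(points, k))
    cols = list(zip(*points))
    means = [sum(c) / len(c) for c in cols]
    sds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 for c, m in zip(cols, means)]
    hits = 0
    for s in range(n_sim):
        null = [tuple(rng.gauss(m, sd) for m, sd in zip(means, sds)) for _ in points]
        if pac(consensus_matrix(null, k, seed=s)) <= observed:
            hits += 1  # null data looked at least as stable as the real data
    return observed, (hits + 1) / (n_sim + 1)

# Toy data: two well-separated groups of 20 samples with 2 features each.
rng = random.Random(42)
blob_a = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(20)]
blob_b = [(rng.gauss(8, 0.5), rng.gauss(8, 0.5)) for _ in range(20)]
points = blob_a + blob_b
observed_pac, p_value = monte_carlo_p_value(points, k=2)
print(f"PAC = {observed_pac:.3f}, Monte Carlo p = {p_value:.3f}")
```

A low PAC means most pairs are either almost always or almost never clustered together, that is, the partition is stable under resampling. The empirical p-value is small when simulated one-cluster data rarely reach the same stability, which is the sense in which a one-cluster null hypothesis can be rejected. Real genomic data would need many more resamples and simulations, and a null model that preserves feature correlations.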
doi:10.35940/ijrte.d7370.118419 fatcat:unvmsfxyibcdvadywkwhrszljq