WeGET: predicting new genes for molecular systems by weighted co-expression

Radek Szklarczyk, Wout Megchelenbrink, Pavel Cizek, Marie Ledent, Gonny Velemans, Damian Szklarczyk, Martijn A. Huynen
2015 Nucleic Acids Research  
We have developed the Weighted Gene Expression Tool and database (WeGET, http://weget.cmbi.umcn.nl) for the prediction of new genes of a molecular system by correlated gene expression. WeGET utilizes a compendium of 465 human and 560 murine gene expression datasets that have been collected from multiple tissues under a wide range of experimental conditions. It exploits this abundance of expression data by assigning a high weight to datasets in which the known genes of a molecular system are harmoniously up- and down-regulated. WeGET ranks new candidate genes by calculating their weighted co-expression with that system. A weighted rank is calculated for human genes and their mouse orthologs. Then, an integrated gene rank and p-value are computed using a rank-order statistic. We applied our method to predict novel genes that have a high degree of co-expression with Gene Ontology terms and pathways from KEGG and Reactome. For each query set we provide a list of predicted novel genes, the computed weights for the transcription datasets used, and the cell and tissue types that contributed to the final predictions. The performance for each query set is assessed by 10-fold cross-validation. Finally, users can use WeGET to predict novel genes that co-express with a custom query set.
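The weighting idea described in the abstract can be illustrated with a minimal sketch. This is not the WeGET implementation. The abstract does not give the exact weighting formula, so the sketch assumes that a dataset's weight is the mean pairwise Pearson correlation among the query genes (floored at zero) and that a candidate's score is the weighted average of its mean correlation with the query genes. The function names `dataset_weight`, `weighted_coexpression` and `rank_candidates` are hypothetical, and the cross-species rank integration and p-value step are omitted.

```python
import numpy as np


def dataset_weight(expr, query):
    """Assumed weight: mean pairwise Pearson correlation among the
    query genes in one dataset (genes x samples), floored at zero."""
    corr = np.corrcoef(expr[query])           # query x query correlations
    upper = np.triu_indices(len(query), k=1)  # each gene pair once
    return max(float(corr[upper].mean()), 0.0)


def weighted_coexpression(datasets, query):
    """Weighted average, over datasets, of each gene's mean correlation
    with the query genes. All datasets must share the same gene order
    and contain no constant rows (their correlation is undefined)."""
    n_genes = datasets[0].shape[0]
    scores = np.zeros(n_genes)
    total_weight = 0.0
    for expr in datasets:
        w = dataset_weight(expr, query)
        corr = np.corrcoef(expr)              # genes x genes correlations
        scores += w * corr[:, query].mean(axis=1)
        total_weight += w
    if total_weight > 0:
        scores /= total_weight
    return scores


def rank_candidates(datasets, query):
    """Return non-query gene indices, best weighted co-expression first."""
    scores = weighted_coexpression(datasets, query)
    query_set = set(query)
    order = np.argsort(-scores)
    return [int(g) for g in order if int(g) not in query_set]
```

Under these assumptions, a dataset in which the query genes do not move together receives a weight near zero and has almost no influence on the candidate ranking, which is the behaviour the abstract attributes to the weighting step.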
doi:10.1093/nar/gkv1228 pmid:26582928 pmcid:PMC4702868 fatcat:gqmmdjvusvc5hl4pxlx4bnhfj4