Predicting candidate genes from phenotypes, functions, and anatomical site of expression [article]

Jun Chen, Azza Althagafi, Robert Hoehndorf
2020 biorxiv/medrxiv   pre-print
Motivation: In recent years, many computational methods have been developed that incorporate information about phenotypes for the disease gene prioritization task. These methods generally compute the similarity between a patient's phenotypes and a database of gene-phenotype associations to find the most phenotypically similar match. The main limitation of these methods is their reliance on knowledge about phenotypes associated with particular genes, which is incomplete in humans as well as in many model organisms such as the mouse and fish. Information about functions of gene products and anatomical site of gene expression is available for more genes and can also be related to phenotypes through ontologies and machine learning models.

Results: We developed a novel graph-based machine learning method for biomedical ontologies that is able to exploit axioms in ontologies and other graph-structured data. Using this method, we embed genes based on their associated phenotypes, the functions of their gene products, and the anatomical location of gene expression. We then develop a machine learning model that predicts gene-disease associations from the associations between genes and multiple biomedical ontologies; this model significantly improves over state-of-the-art methods. Furthermore, we significantly extend phenotype-based gene prioritization to all genes that are associated with phenotypes, functions, or site of expression.

Availability: Software and data are available at https://github.com/bio-ontology-research-group/DL2Vec.

Contact: robert.hoehndorf@kaust.edu.sa
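As a rough illustration of embedding-based gene prioritization, the sketch below ranks candidate genes by cosine similarity between a disease embedding and gene embeddings. This is not the authors' model: the abstract describes a trained prediction model, and cosine similarity is a simpler stand-in used here only to show the ranking step. The gene names, the vectors, and the functions `cosine` and `rank_genes` are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_genes(disease_vec, gene_vecs):
    """Return (gene, score) pairs sorted from most to least similar."""
    scored = [(gene, cosine(disease_vec, vec)) for gene, vec in gene_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up 3-dimensional embeddings; real embeddings would be learned from
# ontology annotations (phenotypes, functions, expression sites).
gene_vecs = {
    "GENE_A": [0.9, 0.1, 0.0],
    "GENE_B": [0.1, 0.8, 0.3],
    "GENE_C": [0.7, 0.2, 0.1],
}
disease_vec = [1.0, 0.0, 0.0]

ranking = rank_genes(disease_vec, gene_vecs)
for gene, score in ranking:
    print(f"{gene}\t{score:.3f}")
```

A gene that has no phenotype annotations can still receive an embedding from its function or expression annotations, which is how this kind of ranking can cover more genes than phenotype-only similarity.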
doi:10.1101/2020.03.30.015594 fatcat:vsr2lpzbmvcz7jesaio7dvrd6i