Global analysis of human genetic variations in protein-coding regions

Eman Alhuzimi, Michael Sternberg, King Abdulaziz City For Science
2019
Do genes enriched in genetic variants that have been linked to human disease have different biological properties from genes enriched in variants that have never been related to disease? What are the characteristics of genes enriched in rare variants, and can they represent the missing link in disease aetiology? The answers to these questions could shed light on the architecture of the pathogenic process and may improve future studies of genetic disorders. To that end, we analysed 481,277 rare variants (minor allele frequency <0.01), 81,822 common variants (minor allele frequency ≥0.01), and 26,884 disease-causing variants occurring in the coding regions of 17,975 human protein-coding genes. Three novel sets of genes were identified: genes enriched in rare variants (32 genes), genes enriched in common variants (282 genes), and genes enriched in disease-causing variants (800 genes). Consistent results from several well-established tools showed that genes enriched in rare variants are far more similar in their biological and network properties to genes enriched in disease-causing variants than to genes enriched in common variants. Thus, genetic variants in these genes are strong candidates for disease, and their identification should prompt further in vitro analyses, as they may represent the missing link in disease heritability. All data used in the analysis, together with other biological data, will be made publicly available through a dedicated database and website. We expect the website to become a foundation for understanding the molecular details of the different types of genetic variants, which in turn will be of great benefit to the medical community.
doi:10.25560/68381 fatcat:icyzdjdvhjfshkveueqq4czloi