
HIV Protein Sequence Hotspots for Crosstalk with Host Hub Proteins

Mahdi Sarmady, William Dampier, Aydin Tozeren, Denis Dupuy
2011 PLoS ONE  
HIV proteins target host hub proteins for transient binding interactions. The presence of viral proteins in the infected cell results in out-competition of host proteins in their interaction with hub proteins, drastically affecting cell physiology. Functional genomics and interactome datasets can be used to quantify the sequence hotspots on the HIV proteome mediating interactions with host hub proteins. In this study, we used the HIV and human interactome databases to identify HIV targeted host
... hub proteins and their host binding partners (H2). We developed a high-throughput computational procedure utilizing motif discovery algorithms on sets of protein sequences, including sequences of HIV and H2 proteins. We identified as HIV sequence hotspots those linear motifs that are highly conserved on HIV sequences and at the same time have a statistically enriched presence on the sequences of H2 proteins. The HIV protein motifs discovered in this study are expressed by subsets of H2 host proteins potentially outcompeted by HIV proteins. A large subset of these motifs is involved in cleavage, nuclear localization, phosphorylation, and transcription factor binding events. Many such motifs are clustered on an HIV sequence in the form of hotspots. The sequential positions of these hotspots are consistent with the curated literature on phenotype-altering residue mutations, as well as with existing binding site data. The hotspot map produced in this study is the first global portrayal of HIV motifs involved in altering the host protein network at highly connected hub nodes.
doi:10.1371/journal.pone.0023293 pmid:21858059 pmcid:PMC3156123 fatcat:ogivfvzz55hwleerff3flfpfsi

The Contractual Relations in Multi-level Marketing Companies

Mehrdad Mehrbakhsh, Mahdi Abbasi Sarmadi
2014 Islamic Law Research  
In a multi-level marketing (MLM) company, distributors buy products wholesale and retail them to end users outside a fixed commercial location. In addition to the margin, they receive a retail commission on every sale. Every distributor can also recruit, train, and sponsor new distributors, directly and indirectly, and receive a percentage as an overall commission. There are thus two contracts between the company and a distributor. One is a sales contract with an implied condition of a fiscal exchanged act, which
... ains the product purchase contract by the distributor and the earning of a retail commission by selling products to end users. The other is a proxy contract whereby the distributor, as the company's proxy, continuously recruits, organizes, sponsors, and trains new distributors. In practice, no contract is concluded between the distributors themselves, but given the way they interact, the contract governing their relationship that is recognized by law is a quasi-lease contract.
doi:10.30497/law.2014.1574 doaj:e15d25b43cd048b2917ab6bb2f8c2862 fatcat:52oxnkaojbga7jqbvdnnxipi7a

Diagnosing Cornelia de Lange syndrome and related neurodevelopmental disorders using RNA-sequencing [article]

Stefan Rentas, Komal Rathi, Maninder Kaur, Pichai Raman, Ian Krantz, Mahdi Sarmady, Ahmad Abou Tayoun
2019 medRxiv   pre-print
Neurodevelopmental phenotypes represent major indications for children undergoing clinical exome sequencing. However, up to 50% of cases remain undiagnosed even upon periodic exome reanalysis. RNA sequencing (RNA-seq) can boost diagnostic yield in neuromuscular diseases, but its utility in neurodevelopmental disorders is hampered out of concern for sourcing relevant tissue for RNA analysis. Here we show human B lymphoblastoid cell lines (LCL) share the transcriptional repertoire of brain tissue
... for a large subset of neurodevelopmental Mendelian genes, enabling testing of over 1000 genetic syndromes. LCLs also showed a 1.8-fold increase in the number of genes causing neurodevelopmental phenotypes when compared to whole blood (LCL, n = 1706; whole blood, n = 917), indicating a more robust testing landscape. We applied an RNA-seq diagnostic pipeline on LCLs from patients with Cornelia de Lange syndrome (CdLS), a rare multisystem neurodevelopmental disorder, and found 100% sensitivity for detection of abnormal splicing and 90% sensitivity for detecting all pathogenic events. Application of the pipeline on unsolved cases of CdLS revealed abnormal splicing and pathogenic coding variants in NIPBL and BRD4. This work demonstrates that the LCL transcriptome enables broad, frontline or reflexive, diagnostic testing for neurodevelopmental disorders.
doi:10.1101/19008300 fatcat:ebbw3d4sajbp5meqf675tznaea

Genetic variant pathogenicity prediction trained using large-scale disease specific clinical sequencing datasets [article]

Perry Evans, Chao Wu, Amanda Lindy, Dianalee McKnight, Matthew Lebo, Mahdi Sarmady, Ahmad Abou Tayoun
2018 bioRxiv   pre-print
Recent advances in high-throughput DNA sequencing technologies have expanded our understanding of the molecular underpinnings of various genetic disorders and have led to increased utilization of genomic tests by clinicians. However, each test can generate thousands of variants, and given the paucity of functional studies assessing each one of them, experimental validation of a variant's clinical significance is not feasible for clinical laboratories. Therefore, many variants are reported as
... ants of unknown clinical significance due to this gap. However, the creation of large variant databases like the Genome Aggregation Database has significantly improved the interpretation of novel variants. Specifically, pathogenicity prediction for novel missense variants can now utilize features describing regional variant constraint. Constrained genomic regions are those that have an unusually low variant count in the general population. Earlier pathogenicity classifiers tried to capture these regions using protein domains. Methods and Findings: Here we introduce one of the largest variant datasets derived from clinical sequencing panels to assess the utility of using old and new concepts of regional features as pathogenicity scores. This dataset is compiled from 17,071 patients surveyed with clinical genomic sequencing for cardiomyopathy, epilepsy, or rasopathies. We use this dataset to justify the necessity of disease-specific classifiers, and train PathoPredictor, a disease-specific ensemble classifier of pathogenicity based on regional constraint and variant-level features. Conclusion: Disease-specific features improve missense variant pathogenicity prediction. As such, PathoPredictor achieves an average precision greater than 90% for variants from all 112 tested disease genes while approaching 100% accuracy for some genes, making it superior to existing generic pathogenicity metrics it uses as features.
doi:10.1101/334235 fatcat:ivp5blsefjd6vlkl3hzng26o2i

ExomeSlicer: a resource for the development and validation of exome-based clinical panels [article]

Rojeen Niazi, Michael A Gonzalez, Jorune Balciuniene, Perry Evans, Mahdi Sarmady, Ahmad N Abou Tayoun
2018 bioRxiv   pre-print
Exome-based panels (exome slices) are becoming the preferred diagnostic strategy especially for genetically heterogeneous disorders. The advantages of this approach include enabling frequent updates to gene content without the need for re-designing, reflexing to exome analysis bioinformatically without requiring additional sequencing, and streamlining laboratory operation by using established exome kits and protocols. Despite their increasing use, there are currently no guidelines or
... resources to support their clinical implementation. Here, we highlight principles and important considerations for the clinical development and validation of exome-based panels, guided by clinical data from a diagnostic epilepsy panel using this approach. We also present a novel, publicly accessible web-based resource, ExomeSlicer, and demonstrate its clinical utility in predicting gene-specific and exome-wide technically challenging regions that are not amenable to Next Generation Sequencing (NGS), and that might significantly lead to increased post hoc Sanger fill-in burden. Using this tool, we also characterize > 2000 low complexity, GC-rich and/or high homology regions across the exome that can be a source of false positive or false negative variant calls, thus potentially leading to misdiagnoses in tested patients. NOTE: RN and MAG are co-first authors on this manuscript.
doi:10.1101/248906 fatcat:tk2fzxc6qfa3zdpylignexfrhm

Using Machine Learning to Facilitate Classification of Somatic Variants from Next-Generation Sequencing [article]

Chao Wu, Xiaonan Zhao, Mark Welsh, Kellianne Costello, Kajia Cao, Ahmad Abou Tayoun, Marilyn Li, Mahdi Sarmady
2019 bioRxiv   pre-print
Background: Molecular profiling has become essential for tumor risk stratification and treatment selection. However, cancer genome complexity and technical artifacts make identification of real variants a challenge. Currently, clinical laboratories rely on manual screening, which is costly, subjective, and not scalable. Here we present a machine learning-based method to distinguish artifacts from bona fide Single Nucleotide Variants (SNVs) detected by NGS from tumor specimens. Methods: A
... rt of 11,278 SNVs identified through clinical sequencing of tumor specimens were collected and divided into training, validation, and test sets. Each SNV was manually inspected and labeled as either real or artifact as part of clinical laboratory workflow. A three-class (real, artifact and uncertain) model was developed on the training set, fine-tuned using the validation set, and then evaluated on the test set. Prediction intervals reflecting the certainty of the classifications were derived during the process to label "uncertain" variants. Results: The optimized classifier demonstrated 100% specificity and 97% sensitivity over 5,587 SNVs of the test set. 1,252 out of 1,341 true positive variants were identified as real, 4,143 out of 4,246 false positive calls were deemed artifacts, while only 192 (3.4%) SNVs were labeled as "uncertain" with zero misclassification between the true positives and artifacts in the test set. Conclusions: We presented a computational classifier to identify variant artifacts detected from tumor sequencing. Overall, 96.6% of the SNVs received a definitive label and thus were exempt from manual review. This framework could improve quality and efficiency of variant review process in clinical labs.
doi:10.1101/670687 fatcat:7agyw6pyb5fnjirvmoltdfblo4
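The test-set counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below also shows one simple way a three-class label could be assigned from a predicted probability; the `label_variant` function and its thresholds are illustrative assumptions, not the prediction-interval method the authors describe.

```python
def review_summary(total, real_confirmed, artifact_confirmed):
    """Summarize how many variants are left without a definitive label."""
    uncertain = total - real_confirmed - artifact_confirmed
    return {
        "uncertain": uncertain,
        "uncertain_pct": round(100 * uncertain / total, 1),
        "definitive_pct": round(100 * (total - uncertain) / total, 1),
    }


def label_variant(p_real, low=0.1, high=0.9):
    """Hypothetical three-class rule: thresholds are assumptions for illustration."""
    if p_real >= high:
        return "real"
    if p_real <= low:
        return "artifact"
    return "uncertain"


# Counts taken from the abstract: 5,587 test SNVs, 1,252 called real,
# 4,143 called artifact; the remainder are the "uncertain" calls.
summary = review_summary(total=5587, real_confirmed=1252, artifact_confirmed=4143)
print(summary)
```

The remainder works out to the 192 uncertain SNVs (3.4%) and the 96.6% definitive labels reported in the abstract.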

AnthOligo: Automating the design of oligonucleotides for capture/enrichment technologies [article]

Pushkala Jayaraman, Timothy Mosbruger, Taishan Hu, Nikolaos G Tairis, Chao Wu, Peter M Clark, Monica D'Arcy, Deborah Ferriola, Katarzyna Mackiewicz, Xiaowu Gai, Dimitrios Monos, Mahdi Sarmady
2019 bioRxiv   pre-print
Summary: A number of methods have been devised to address the need for targeted genomic resequencing. One of these methods, Region-specific extraction (RSE) of DNA, is characterized by the capture of long DNA fragments (15-20 kb) by magnetic beads, after enzymatic extension of oligonucleotides hybridized to selected genomic regions. Facilitating the selection of the most optimal capture oligos targeting a region of interest, satisfying the properties of temperature (Tm) and entropy (ΔG),
... ile minimizing the formation of primer dimers in a pooled experiment is therefore necessary. Manual design and selection of oligos becomes an extremely arduous task complicated by factors such as length of the target region and number of targeted regions. Here we describe AnthOligo, a web-based application developed to optimally automate the process of generation of oligo sequences to be used for the targeting and capturing the continuum of large and complex genomic regions. Apart from generating oligos for RSE, this program may have wider applications in the design of customizable internal oligos to be used as baits for gene panel analysis or even probes for large-scale comparative genomic hybridization (CGH) array processes. Implementation and Availability: The application written in Java8 and run on Tomcat9 is a lightweight Java Spring MVC framework that provides the user with a simple interface to upload an input file in BED format and customize parameters for each task. A Redis-like MapReduce framework is implemented to run sub-tasks in parallel to optimize time and system resources alongside a 'task-queuing' system that runs submitted jobs as a server-side background daemon. The task of probe design in AnthOligo commences when a user uploads an input file and concludes with the generation of a result-set containing an optimal set of region-specific oligos. AnthOligo is currently available as a public web application with URL: http://antholigo.chop.edu.
doi:10.1101/2019.12.12.873497 fatcat:nczdd5mq55bxrofvig6rj27kgq
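To make the design constraints named in this abstract concrete, here is a minimal sketch of two checks an oligo designer might run: a rough melting temperature and a 3'-end complementarity screen for primer dimers. It assumes the Wallace rule (2 °C per A/T, 4 °C per G/C), which is only a rule of thumb for short oligos, and a hypothetical four-base dimer window; neither is taken from AnthOligo itself.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def wallace_tm(seq):
    """Rough Tm in degrees Celsius using the Wallace rule (an assumption here)."""
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


def three_prime_dimer(a, b, window=4):
    """True if the last `window` bases of `a` can pair with the last `window` bases of `b`."""
    return a.upper()[-window:] == reverse_complement(b[-window:])


print(wallace_tm("ATGC"))
print(three_prime_dimer("TTTTGGAT", "CCCCATCC"))
```

A pooled design would apply the dimer screen to every pair of candidate oligos and discard or re-pick those that fail.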

Phen2Gene: Rapid Phenotype-Driven Gene Prioritization for Rare Diseases [article]

Mengge Zhao, James M Havrilla, Li Fang, Ying Chen, Jacqueline Peng, Cong Liu, Chao Wu, Mahdi Sarmady, Pablo Botas, Julian Isla, Gholson Lyon, Chunhua Weng (+1 others)
2019 bioRxiv   pre-print
Human Phenotype Ontology (HPO) terms are increasingly used in diagnostic settings to aid in the characterization of patient phenotypes. The HPO annotation database is updated frequently and can provide detailed phenotype knowledge on various human diseases, and many HPO terms are now mapped to candidate causal genes with binary relationships. To further improve the genetic diagnosis of rare diseases, we incorporated these HPO annotations, gene-disease databases, and gene-gene databases in a
... abilistic model to build a novel HPO-driven gene prioritization tool, Phen2Gene. Phen2Gene accesses a database built upon this information called the HPO2Gene Knowledgebase (H2GKB), which provides weighted and ranked gene lists for every HPO term. Phen2Gene is then able to access the H2GKB for patient-specific lists of HPO terms or PhenoPackets descriptions supported by GA4GH (http://phenopackets.org/), calculate a prioritized gene list based on a probabilistic model, and output gene-disease relationships with great accuracy. Phen2Gene outperforms existing gene prioritization tools in speed, and acts as a real-time phenotype driven gene prioritization tool to aid the clinical diagnosis of rare undiagnosed diseases. In addition to a command line tool released under the MIT license (https://github.com/WGLab/Phen2Gene), we also developed a web server and web service (https://phen2gene.wglab.org/) for running the tool via web interface or RESTful API queries. Finally, we have curated a large amount of benchmarking data for phenotype-to-gene tools involving 197 patients across 76 scientific articles and 85 patients' de-identified HPO term data from CHOP.
doi:10.1101/870527 fatcat:j5xhkvij3zaqbflf57gaddl3va

Phenomenology of Writing Skill in Order to Provide Solutions to Improve Writing Skills

Elahe Kalantari Dehaghi, Seyed Mahdi Sajjadi, Mohammad Reza Sarmadi, Zohreh Esmaeili
2015 Mediterranean Journal of Social Sciences  
The study of writing skills in the cognitive literature, as one of the core skills of language, goes back to the 1980s (Fayazbakhsh, 2006). From the point of view of education experts, writing involves a series of audio-visual, linguistic and perceptual capabilities; it is regarded as one of the best vehicles and widest fields for creative thinking and the production of artistic wisdom (Hadian, 2006) and, as one of the major skills of life, training and education, plays a decisive
... role in better living and academic achievement (Fayazbakhsh, 2006). In the recent decade, which coincides with the formation of the information age and the advent of the fourth generation of technologies, namely information and communication technologies, and of training and e-learning with features such as hypertext and hypermedia (Rahmani, 2007), there has been great emphasis on the use of writing skills (Zaman poor, 2010) in order to produce knowledge (Attaran, 2009). Given the growing use of writing skills through new learning technologies, this research pursues three primary goals: first, to examine the characteristics of writing skills and the reasons for their importance; second, to review the concerns arising from the use of writing skills in virtual spaces; third, to offer guidelines to reduce these concerns and improve writing skills among learners. The study takes a qualitative approach and uses documentary research as its method. After information was gathered from library sources and websites, the concepts were analyzed and interpreted, and conclusions were drawn. ... that while they are writing for themselves, there is no need for correct spelling and writing of words. Only when the author intends the work to be published is there a need for editing. If the child feels free or different about his work, he will be more interested in writing. 26. When the child is writing, we should not interfere in his work and his curiosity. When he thinks that what is written is really his own, he will enjoy it and write more. Indifference or persistent criticism of a child's writing makes it more difficult to write. After some encouragement, we can offer some constructive supervision, criticism, and polishing of his writing, especially when he is just starting to write. 27. A child can be asked to write about impossible things. The less realistic the story, the better. This makes the child use his or her mind and imagination more. 28. Sometimes it is necessary to write the first words to help children (Salahshour, 2004).
doi:10.5901/mjss.2015.v6n6s6p114 fatcat:4rjbp577kjhwfkznt7xkebrxiy

Genetic variant pathogenicity prediction trained using disease-specific clinical sequencing data sets

Perry Evans, Chao Wu, Amanda Lindy, Dianalee A. McKnight, Matthew Lebo, Mahdi Sarmady, Ahmad N. Abou Tayoun
2019 Genome Research  
Recent advances in DNA sequencing have expanded our understanding of the molecular basis of genetic disorders and increased the utilization of clinical genomic tests. Given the paucity of evidence to accurately classify each variant and the difficulty of experimentally evaluating its clinical significance, a large number of variants generated by clinical tests are reported as variants of unknown clinical significance. Population-scale variant databases can improve clinical interpretation.
... ically, pathogenicity prediction for novel missense variants can use features describing regional variant constraint. Constrained genomic regions are those that have an unusually low variant count in the general population. Computational methods have been introduced to capture these regions and incorporate them into pathogenicity classifiers, but these methods have yet to be compared on an independent clinical variant data set. Here, we introduce one variant data set derived from clinical sequencing panels and use it to compare the ability of different genomic constraint metrics to determine missense variant pathogenicity. This data set is compiled from 17,071 patients surveyed with clinical genomic sequencing for cardiomyopathy, epilepsy, or RASopathies. We further use this data set to demonstrate the necessity of disease-specific classifiers and to train PathoPredictor, a disease-specific ensemble classifier of pathogenicity based on regional constraint and variant-level features. PathoPredictor achieves an average precision >90% for variants from all 99 tested disease genes while approaching 100% accuracy for some genes. The accumulation of larger clinical variant training data sets can significantly enhance their performance in a disease- and gene-specific manner.
doi:10.1101/gr.240994.118 pmid:31235655 pmcid:PMC6633260 fatcat:qut7viy5l5h2bdwbaude35ubx4

Efficient digest of high-throughput sequencing data in a reproducible report

Zhe Zhang, Jeremy Leipzig, Ariella Sasson, Angela M Yu, Juan C Perin, Hongbo M Xie, Mahdi Sarmady, Patrick V Warren, Peter S White
2013 BMC Bioinformatics  
High-throughput sequencing (HTS) technologies are spearheading the accelerated development of biomedical research. Processing and summarizing the large amount of data generated by HTS presents a nontrivial challenge to bioinformatics. A commonly adopted standard is to store sequencing reads aligned to a reference genome in SAM (Sequence Alignment/Map) or BAM (Binary Alignment/Map) files. Quality control of SAM/BAM files is a critical checkpoint before downstream analysis. The goal of the
... project is to facilitate and standardize this process. Results: We developed bamchop, a robust program to efficiently summarize key statistical metrics of HTS data stored in BAM files, and to visually present the results in a formatted report. The report documents information about various aspects of HTS data, such as sequencing quality, mapping to a reference genome, sequencing coverage, and base frequency. Bamchop uses the R language and Bioconductor packages to calculate statistical matrices and the Sweave utility and associated LaTeX markup for documentation. Bamchop's efficiency and robustness were tested on BAM files generated by local sequencing facilities and the 1000 Genomes Project. Source code, instruction and example reports of bamchop are freely available from https://github.com/CBMi-BiG/bamchop. Conclusions: Bamchop enables biomedical researchers to quickly and rigorously evaluate HTS data by providing a convenient synopsis and user-friendly reports.
doi:10.1186/1471-2105-14-s11-s3 pmid:24564231 pmcid:PMC3846741 fatcat:glvvga2zxjeotkdd4mvso26yde

Reviewing of a Child's Age and its Legal Status from the Perspective of International Documents

mahdi sarmadi abbasi, ahdiyeh zangi ahrami
2016 Fiqh va ḥuqūq-i khānavādah  
Article 1 of the Convention on the Rights of the Child, adopted in 1989, recognizes 18 years of age as the dividing line between childhood and adulthood. At the same time, it allows member states to apply a lower age in defining the "child". Exceptions also exist in some articles: Article 37, regarding criminal liability, does not allow the death penalty or life imprisonment without the possibility of release for persons under 18 years of age. Article 38 concerns
... he prohibition on countries recruiting or employing persons under 15 years of age in armed conflicts. These are examples of issues where a certain age has been set for children and countries are bound to follow it. The main question of the present paper is: what age should serve as the basis for the conduct of the Member States? In accordance with the prevailing trend in the international community, the child's legal age is 18 years, but given the existing ambiguities, it is necessary to improve the present situation and clarify children's rights, so that, in addition to resolving the ambiguity regarding the legal status of the child's age, the legal conditions governing it, in issues such as child labor, marriage, participation in armed conflicts, and criminal responsibility, are examined.
doi:10.30497/flj.2016.56648 doaj:d5b853bb727349adbd565ec4819c58b1 fatcat:zr44l4pqireynhnyk3hsjpb4da

Transcriptome analysis of IL-10-stimulated (M2c) macrophages by next-generation sequencing

Emily B. Lurier, Donald Dalton, Will Dampier, Pichai Raman, Sina Nassiri, Nicole M. Ferraro, Ramakrishan Rajagopalan, Mahdi Sarmady, Kara L. Spiller
2017 Immunobiology  
Alternatively activated "M2" macrophages are believed to function during late stages of wound healing, behaving in an anti-inflammatory manner to mediate the resolution of the proinflammatory response caused by "M1" macrophages. However, the differences between two main subtypes of M2 macrophages, namely interleukin-4 (IL-4)-stimulated "M2a" macrophages and IL-10-stimulated "M2c" macrophages, are not well understood. M2a macrophages are characterized by their ability to inhibit inflammation and
... contribute to the stabilization of angiogenesis. However, the role and temporal profile of M2c macrophages in wound healing are not known. Therefore, we performed next generation sequencing (RNA-seq) to identify biological functions and gene expression signatures of macrophages polarized in vitro with IL-10 to the M2c phenotype in comparison to M1 and M2a macrophages and an unactivated control (M0). We then explored the expression of these gene signatures in a publicly available data set of human wound healing. RNA-seq analysis showed that hundreds of genes were upregulated in M2c macrophages compared to the M0 control, with thousands of alternative splicing events. Following validation by Nanostring, 39 genes were found to be upregulated by M2c macrophages compared to the M0 control, and 17 genes were significantly upregulated relative to the M0, M1, and M2a phenotypes (using an adjusted p-value cutoff of 0.05 and fold change cutoff of 1.5). Many of the identified M2c-specific genes are associated with angiogenesis, matrix remodeling, and phagocytosis, including CD163, MMP8, TIMP1, VCAN, SERPINA1, MARCO, PLOD2, PCOCLE2 and F5. showed that M2c macrophages secreted higher levels of MMP7, MMP8, and TIMP1 compared to the other phenotypes. Interestingly, temporal gene expression analysis of a publicly available microarray data set of human wound healing showed that M2c-related genes were upregulated at early times after injury, similar to M1-related genes, while M2a-related genes appeared at later stages or were downregulated after injury. While further studies are required to confirm the timing and role of M2c macrophages in vivo, these results suggest that M2c macrophages may function at early stages of wound healing. Identification of markers of the M2c phenotype will allow more detailed investigations into the role of M2c macrophages in vivo.
doi:10.1016/j.imbio.2017.02.006 pmid:28318799 pmcid:PMC5719494 fatcat:d7rzlzuhovh23dosuy5jonmkg4

Sequence- and Interactome-Based Prediction of Viral Protein Hotspots Targeting Host Proteins: A Case Study for HIV Nef

Mahdi Sarmady, William Dampier, Aydin Tozeren, Jianming Qiu
2011 PLoS ONE  
Virus proteins alter protein pathways of the host toward the synthesis of viral particles by breaking and making edges via binding to host proteins. In this study, we developed a computational approach to predict viral sequence hotspots for binding to host proteins based on sequences of viral and host proteins and literature-curated virus-host protein interactome data. We use a motif discovery algorithm repeatedly on collections of sequences of viral proteins and immediate binding partners of
... eir host targets and choose only those motifs that are conserved on viral sequences and highly statistically enriched among binding partners of virus protein targeted host proteins. Our results match experimental data on binding sites of Nef to host proteins such as MAPK1, VAV1, LCK, HCK, HLA-A, CD4, FYN, and GNB2L1 with high statistical significance, but the approach is a poor predictor of Nef binding sites on highly flexible, hoop-like regions. Predicted hotspots recapture CD8 cell epitopes of HIV Nef, highlighting their importance in modulating virus-host interactions. Host proteins potentially targeted or outcompeted by Nef appear to crowd the T cell receptor, natural killer cell mediated cytotoxicity, and neurotrophin signaling pathways. Scanning of HIV Nef motifs on multiple alignments of hepatitis C protein NS5A produces results consistent with literature, indicating the potential value of the hotspot discovery in advancing our understanding of virus-host crosstalk.
doi:10.1371/journal.pone.0020735 pmid:21738584 pmcid:PMC3125164 fatcat:t6zzyuczsbffbhipwchl33hvje
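The enrichment step described in this abstract can be illustrated with a small sketch. It assumes a one-sided hypergeometric test and a regular-expression motif; the abstract does not state which statistical test the authors used, and the sequences below are invented toy data, not real proteins.

```python
import re
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) for a hypergeometric variable (assumed test, not the authors')."""
    upper = min(successes, draws)
    hits = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return hits / comb(population, draws)


def motif_enrichment(pattern, targets, background):
    """p-value for a motif being over-represented in `targets` versus all sequences."""
    regex = re.compile(pattern)
    everything = targets + background
    with_motif = sum(1 for s in everything if regex.search(s))
    in_targets = sum(1 for s in targets if regex.search(s))
    return hypergeom_sf(in_targets, len(everything), with_motif, len(targets))


# Toy sequences (invented): three "binding partner" sequences and seven others.
targets = ["AAPQKPAA", "PLLPGG", "MMPTTPKK"]
background = ["AAAA", "GGGG", "KKKK", "LLLL", "PAAPMM", "TTTT", "SSSS"]
p_value = motif_enrichment(r"P..P", targets, background)
print(p_value)
```

A motif would be kept as a candidate hotspot only if it is also conserved across the viral sequences, which this sketch does not model.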

Phen2Gene: rapid phenotype-driven gene prioritization for rare diseases

Mengge Zhao, James M Havrilla, Li Fang, Ying Chen, Jacqueline Peng, Cong Liu, Chao Wu, Mahdi Sarmady, Pablo Botas, Julián Isla, Gholson J Lyon, Chunhua Weng (+1 others)
2020 NAR Genomics and Bioinformatics  
Human Phenotype Ontology (HPO) terms are increasingly used in diagnostic settings to aid in the characterization of patient phenotypes. The HPO annotation database is updated frequently and can provide detailed phenotype knowledge on various human diseases, and many HPO terms are now mapped to candidate causal genes with binary relationships. To further improve the genetic diagnosis of rare diseases, we incorporated these HPO annotations, gene–disease databases and gene–gene databases in a
... bilistic model to build a novel HPO-driven gene prioritization tool, Phen2Gene. Phen2Gene accesses a database built upon this information called the HPO2Gene Knowledgebase (H2GKB), which provides weighted and ranked gene lists for every HPO term. Phen2Gene is then able to access the H2GKB for patient-specific lists of HPO terms or PhenoPacket descriptions supported by GA4GH (http://phenopackets.org/), calculate a prioritized gene list based on a probabilistic model and output gene–disease relationships with great accuracy. Phen2Gene outperforms existing gene prioritization tools in speed and acts as a real-time phenotype-driven gene prioritization tool to aid the clinical diagnosis of rare undiagnosed diseases. In addition to a command line tool released under the MIT license (https://github.com/WGLab/Phen2Gene), we also developed a web server and web service (https://phen2gene.wglab.org/) for running the tool via web interface or RESTful API queries. Finally, we have curated a large amount of benchmarking data for phenotype-to-gene tools involving 197 patients across 76 scientific articles and 85 patients' de-identified HPO term data from the Children's Hospital of Philadelphia.
doi:10.1093/nargab/lqaa032 pmid:32500119 pmcid:PMC7252576 fatcat:c4nz6ivayfd3lgys6noz2okwtm
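The idea of combining per-term weighted gene lists into one patient-level ranking can be sketched as follows. This is a toy aggregation that simply sums weights; it is not Phen2Gene's probabilistic model or its API, and the knowledgebase contents, gene names, and weights are invented placeholders.

```python
from collections import defaultdict


def prioritize(patient_terms, knowledgebase):
    """Rank genes by summed weight across the patient's HPO terms (toy scoring)."""
    scores = defaultdict(float)
    for term in patient_terms:
        for gene, weight in knowledgebase.get(term, {}).items():
            scores[gene] += weight
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical knowledgebase: HPO term -> {gene: weight}. All values are made up.
toy_kb = {
    "HP:0001250": {"GENE_A": 1.0, "GENE_B": 0.5},
    "HP:0001263": {"GENE_B": 0.75, "GENE_C": 0.25},
}
ranking = prioritize(["HP:0001250", "HP:0001263"], toy_kb)
print(ranking)
```

Precomputing the per-term lists, as the abstract describes for the H2GKB, means a query only needs lookups and a sort, which is consistent with the emphasis on speed.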