Microbial functional diversification is driven by environmental factors, i.e. microorganisms inhabiting the same environmental niche tend to be more functionally similar than those from different environments. In some cases, even closely phylogenetically related microbes differ more across environments than across taxa. While microbial similarities are often reported in terms of taxonomic relationships, no existing database directly links microbial functions to the environment. We previously developed a method for comparing microbial functional similarities on the basis of proteins translated from the sequenced genomes. Here we describe fusionDB, a novel database that uses our functional data to represent 1,374 taxonomically distinct bacteria annotated with available metadata: habitat/niche, preferred temperature, and oxygen use. Each microbe is encoded as a set of functions represented by its proteome, and individual microbes are connected via common functions. Users can search fusionDB via combinations of organism names and metadata. Moreover, the web interface allows mapping new microbial genomes to the functional spectrum of reference bacteria, rendering interactive similarity networks that highlight shared functionality. fusionDB provides a fast means of comparing microbes, identifying potential horizontal gene transfer events, and highlighting key environment-specific functionality. fusionDB is publicly available at http://services.bromberglab.org/fusiondb/.
doi:10.1101/035923 fatcat:wfjyszjv3bh7vdjnmpun67kdtu
Single nucleotide variants (SNVs) have been widely studied in the past due to being the main source of human genetic variation. Less is known about the effect of single amino acid variants (SAVs) due to the immense resources required for comprehensive experimental studies. In contrast, in silico methods predicting the effects of sequence variants upon molecular function and upon the organism are readily available and have contributed unexpected suggestions, e.g. that SAVs common to a human population (shared by >5% of the population) have, on average, a more significant impact on the molecular function of proteins than do rare SAVs (shared by <1% of the population). Here, we investigated the impact of variants in a human population upon helical transmembrane proteins (TMPs). Three main results stood out. Firstly, common SAVs, on average, have stronger effects than rare SAVs for TMPs, and are enriched, in particular, in the membrane helices. Secondly, proteins with seven transmembrane helices (7TM, including GPCRs, i.e. G protein-coupled receptors) are depleted of SAVs in comparison to other proteins, possibly due to increased evolutionary constraints in these important proteins. Thirdly, rare SAVs with strong effect are significantly underrepresented (relative to common SAVs) in signal peptide regions.
doi:10.1101/2019.12.18.881318 fatcat:cdrpp4i2kfbdvk77hpnuerrszu
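The frequency classes used in this abstract (rare: shared by <1% of the population; common: shared by >5%) can be illustrated with a minimal Python sketch. The two thresholds come from the abstract; the function name, the "intermediate" label for the 1-5% range, and the example variants are assumptions made purely for illustration and are not taken from the paper.

```python
from collections import Counter

# Thresholds taken from the abstract.
RARE_MAX = 0.01    # "rare": shared by <1% of the population
COMMON_MIN = 0.05  # "common": shared by >5% of the population


def frequency_class(allele_frequency: float) -> str:
    """Assign a variant to a frequency class by its population frequency.

    The label "intermediate" for the 1-5% range is an assumption of this
    sketch; the abstract only names the rare and common classes.
    """
    if not 0.0 <= allele_frequency <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    if allele_frequency < RARE_MAX:
        return "rare"
    if allele_frequency > COMMON_MIN:
        return "common"
    return "intermediate"


# Hypothetical variants with made-up population frequencies.
variants = {"V1": 0.0004, "V2": 0.12, "V3": 0.03, "V4": 0.009}
classes = {name: frequency_class(freq) for name, freq in variants.items()}
counts = Counter(classes.values())
```

With classes assigned this way, per-class averages of any predicted effect score could be compared, which is the kind of rare-versus-common comparison the abstract reports.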
Motivation: The rapid drop in sequencing costs has produced many more (predicted) protein sequences than can feasibly be functionally annotated with wet-lab experiments. Thus, many computational methods have been developed for this purpose. Most of these methods employ homology-based inference, approximated via sequence alignments, to transfer functional annotations between proteins. The increase in the number of available sequences, however, has drastically increased the search space, thus significantly slowing down alignment methods. Results: Here we describe homology-derived functional similarity of proteins (HFSP), a novel computational method that uses results of a high-speed alignment algorithm, MMseqs2, to infer functional similarity of proteins on the basis of their alignment length and sequence identity. We show that our method is accurate (85% precision) and fast (more than 40-fold speed increase over state-of-the-art). HFSP can help correct at least a 16% error in legacy curations, even for a resource of as high quality as Swiss-Prot. These findings suggest HFSP as an ideal resource for large-scale functional annotation efforts.
doi:10.1093/bioinformatics/bty262 pmid:29950013 pmcid:PMC6022561 fatcat:i2qq3lpoajd4pchbdzgdvmaxte
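The idea of scoring a pairwise alignment from its length and sequence identity can be sketched as follows. This is not the published HFSP formula, which is not given in this abstract: the threshold function `toy_threshold`, its step values, and the zero cutoff are hypothetical placeholders chosen only to show the shape of such a measure (shorter alignments require higher identity before function is transferred).

```python
from typing import Callable


def toy_threshold(alignment_length: int) -> float:
    """HYPOTHETICAL length-dependent identity threshold (in percent).

    Placeholder values only; the real HFSP curve is defined in the paper
    and is not reproduced here.
    """
    if alignment_length < 50:
        return 60.0
    if alignment_length < 200:
        return 40.0
    return 30.0


def similarity_score(
    identity_pct: float,
    alignment_length: int,
    threshold: Callable[[int], float] = toy_threshold,
) -> float:
    """Distance of the observed identity above (+) or below (-) the threshold."""
    return identity_pct - threshold(alignment_length)


def same_function(identity_pct: float, alignment_length: int, cutoff: float = 0.0) -> bool:
    """Predict shared function when the score reaches the (assumed) cutoff."""
    return similarity_score(identity_pct, alignment_length) >= cutoff


# The same 45% identity is treated differently for a short and a long alignment.
short_hit = same_function(45.0, 30)
long_hit = same_function(45.0, 300)
```

Passing the threshold as a parameter keeps the sketch agnostic about the actual curve, so a published definition could be substituted without changing the calling code.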
Accumulating evidence suggests that the human microbiome impacts individual and public health. City subway systems are human-dense environments, where passengers often exchange microbes. The MetaSUB project participants collected samples from subway surfaces in different cities and performed metagenomic sequencing. Previous studies focused on taxonomic composition of these microbiomes and no explicit functional analysis had been done until now.
doi:10.1186/s13062-019-0252-y pmid:31666099 pmcid:PMC6822482 fatcat:274ny5amqba2nj35qqgnpqd7uq
Microbial functional diversification is driven by environmental factors, i.e. microorganisms inhabiting the same environmental niche tend to be more functionally similar than those from different environments. In some cases, even closely phylogenetically related microbes differ more across environments than across taxa. While microbial similarities are often reported in terms of taxonomic relationships, no existing databases directly link microbial functions to the environment. We previously developed a method for comparing microbial functional similarities on the basis of proteins translated from their sequenced genomes. Here, we describe fusionDB, a novel database that uses our functional data to represent 1374 taxonomically distinct bacteria annotated with available metadata: habitat/niche, preferred temperature, and oxygen use. Each microbe is encoded as a set of functions represented by its proteome, and individual microbes are connected via common functions. Users can search fusionDB via combinations of organism names and metadata. Moreover, the web interface allows mapping new microbial genomes to the functional spectrum of reference bacteria, rendering interactive similarity networks that highlight shared functionality. fusionDB provides a fast means of comparing microbes, identifying potential horizontal gene transfer events, and highlighting key environment-specific functionality.
doi:10.1093/nar/gkx1060 pmid:29112720 pmcid:PMC5753390 fatcat:ejqpk3qidvecraq3czdennosz4
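The representation described in this abstract, each microbe as a set of functions with microbes connected via common functions, can be sketched in a few lines of Python. The microbe names, function identifiers, the `min_shared` parameter, and the use of the Jaccard index as a similarity measure are all assumptions for illustration; the abstract does not specify how fusionDB weights its network edges.

```python
from itertools import combinations

# Hypothetical microbes, each encoded as a set of (made-up) function IDs.
microbes = {
    "microbe_A": {"F1", "F2", "F3", "F4"},
    "microbe_B": {"F2", "F3", "F4", "F5"},
    "microbe_C": {"F6", "F7"},
}


def jaccard(a: set, b: set) -> float:
    """Shared functions divided by all functions found in either microbe."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_edges(genomes: dict, min_shared: int = 1) -> list:
    """Connect every pair of microbes sharing at least `min_shared` functions.

    Returns (name_1, name_2, number_of_shared_functions, jaccard) tuples.
    """
    edges = []
    for name_1, name_2 in combinations(sorted(genomes), 2):
        shared = genomes[name_1] & genomes[name_2]
        if len(shared) >= min_shared:
            edges.append(
                (name_1, name_2, len(shared), jaccard(genomes[name_1], genomes[name_2]))
            )
    return edges


edges = similarity_edges(microbes)
```

A new genome could be mapped onto such a network by adding its function set to the dictionary and recomputing the edges, which mirrors in spirit the mapping of new genomes that the abstract attributes to the web interface.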
Any two unrelated individuals differ by about 10,000 single amino acid variants (SAVs). Do these impact molecular function? Experiments cannot answer this comprehensively, while state-of-the-art prediction methods can. We predicted the functional impacts of SAVs within human and for variants between human and other species. Several surprising results stood out. Firstly, four methods (CADD, PolyPhen-2, SIFT, and SNAP2) agreed within 10 percentage points on the percentage of rare SAVs predicted with effect. However, they differed substantially for the common SAVs: SNAP2 predicted, on average, more effect for common than for rare SAVs. Given the large ExAC data sets sampling 60,706 individuals, the differences were extremely significant (p-value < 2.2e-16). We provided evidence that SNAP2 might be closer to reality for common SAVs than the other methods, due to its different focus in development. Secondly, we predicted significantly higher fractions of SAVs with effect between healthy individuals than between species; the difference increased for more distantly related species. The same trends were maintained for subsets of only housekeeping proteins and when moving from exomes of 1,000 to 60,000 individuals. SAVs frozen at speciation might maintain protein function, while many variants within a species might bring about crucial changes, for better or worse.

Single nucleotide variants (SNVs) constitute the most frequent form of human genetic variation [1]. Here, we focus on non-synonymous SNVs, i.e. genomic variants that result in single amino acid variants (SAVs) in protein sequences. Children differ by about two SAVs from their parents (de novo variation), while any two unrelated individuals can differ by as many as 10-20K [2]. The vast majority (99%) of the known unique SAVs are rare, i.e. observed in less than 1% of the population [1, 3]. Only about 0.5% of the unique SAVs are common, i.e. observed in over 5% of the population [1, 3]. SAVs can impact protein function in many ways. We might be inclined to classify SAVs according to what they affect or do not affect. A common distinction is between effects upon protein function and effects upon protein structure. This distinction has limited value because what changes structure often tends to affect function. Similarly, we might distinguish between the effect upon molecular function (e.g. binding stronger or not binding), upon the role of a protein in a process (native process hampered, blocked, or non-native role acquired), or upon the localization of a protein (e.g. protein makes it to the membrane or not). Again, the problem with this distinction is that these aspects are coupled: for instance, effects upon molecular function and localization might affect the process or not. All of the above we might classify as effects upon the protein. Unfortunately, from all experiments monitoring SAV effects in many model organisms, only a few tens of thousands of effects are available in public databases. For a tiny subset of these, enough detail is available to consider all effect types (structure vs. function, molecular vs. process vs. localization). We might consider the effect upon the protein as molecular, as opposed to the effect upon the organism, such as diseases. Toward this end, the distinction is often made between SAVs that cause severe monogenic diseases [4] (referred to as OMIM-type SAVs) or contribute to complex diseases [5] and low-effect SAVs, which are only cumulatively linked to our phenotypic ...
Scientific Reports 7: 1608. doi:10.1038/s41598-017-01054-2 pmid:28487536 pmcid:PMC5431670 fatcat:kivphgy27nfvrmbjl3oxorn3kq
Developments in experimental and computational biology are advancing our understanding of how protein sequence variation impacts molecular protein function. However, the leap from the micro level of molecular function to the macro level of the whole organism, e.g. disease, remains barred. Here, we present new results emphasizing earlier work that suggested some links from molecular function to disease. We focused on non-synonymous single nucleotide variants, also referred to as single amino acid variants (SAVs). Building upon OMIA (Online Mendelian Inheritance in Animals), we introduced a curated set of 117 disease-causing SAVs in animals. Methods optimized to capture effects upon molecular function often correctly predict human (OMIM) and animal (OMIA) Mendelian disease-causing variants. We also predicted effects of human disease-causing variants in the mouse model, i.e. we put OMIM SAVs into mouse orthologs. Overall, fewer variants were predicted with effect in the model organism than in the original organism. Our results, along with other recent studies, demonstrate that predictions of molecular effects capture some important aspects of disease. Thus, in silico methods focusing on the micro level of molecular function can help to understand the macro system level of disease.

The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Competing Interests: The authors have declared that no competing interests exist.

... diseases in human and introduced a data set for animal diseases that was also captured by prediction methods. Predicted effects were weaker when human variants were tested in silico in an animal model (here mouse). This is important to know because "mouse models" are commonly used to study human diseases. Overall, we provided some evidence for a link between the molecular level and some type of disease.

Results and Discussion
OMIM variants predicted to have strong effect
SIFT predicts the impact of variants upon molecular protein function by assessing the disruption of conserved residues. SNAP predicts this impact by considering ...
PLOS Computational Biology. doi:10.1371/journal.pcbi.1005047 pmid:27536940 pmcid:PMC4990455 fatcat:gharlccpnvbhbfpfdwwzvuosuy
We previously proposed the concept of cross-species variants (CSV) analysis (Mahlich, et al., 2017), which is similar to but intuitively different from conservation evaluation. ... We previously reported (Mahlich, et al., 2017) that amino acid CSVs have less predicted molecular functional effects on average than human variation recorded by the Exome Aggregation Consortium (Lek ...
doi:10.3389/fmolb.2021.635382 pmid:33816556 pmcid:PMC8012514 fatcat:zli2txlejfdrxihpov5jd6epem
Computational exploration of similarities among metal-binding protein structural motifs elucidates the origins of life.
doi:10.1126/sciadv.abj3984 pmid:35030025 pmcid:PMC8759750 fatcat:blqt2wa4efeyldrhtjmjscbn2e
Automated annotation of protein function is challenging. As the number of sequenced genomes rapidly grows, the overwhelming majority of protein products can only be annotated computationally. If computational predictions are to be relied upon, it is crucial that the accuracy of these methods be high. Here we report the results from the first large-scale community-based critical assessment of protein function annotation (CAFA) experiment. Fifty-four methods representing the state of the art for protein function prediction were evaluated on a target set of 866 proteins from 11 organisms. Two findings stand out: (i) today's best protein function prediction algorithms substantially outperform widely used first-generation methods, with large gains on all types of targets; and (ii) although the top methods perform well enough to guide experiments, there is considerable need for improvement of currently available tools.

The accurate annotation of protein function is key to understanding life at the molecular level and has great biomedical and pharmaceutical implications. However, with its inherent difficulty and expense, experimental characterization of function cannot scale up to accommodate the vast amount of sequence data already ...
doi:10.1038/nmeth.2340 pmid:23353650 pmcid:PMC3584181 fatcat:7rqgsz4wgfa45lb4iirt2sa4ai
Jonas Reeb, Maximilian Hecht, Yannick Mahlich, Yana Bromberg, and Burkhard Rost describe predictions of molecular effects of sequence variants to bridge the gap from the micro level of molecular function ...
doi:10.1371/journal.pcbi.1005043 pmid:27536867 pmcid:PMC4990343 fatcat:tw7gtnb4gvfktfpgejkoralrha
Yannick Mahlich, Yanran Wang, and Zishuo Zeng (Rutgers University) for the useful discussion and suggestions. ... metatranscriptomes under different conditions highlights the key microbial members and their molecular functions that result from and/or contribute to niche differences (Zhu, Delmont, Vogel, & Bromberg, 2015; Zhu, Mahlich ...
doi:10.1002/mbo3.1100 pmid:32762019 fatcat:wbad2vc62vg7jpal7bacdy5m5u
Yannick Mahlich, Yanran Wang and Zishuo Zeng (Rutgers University) for the useful discussion and suggestions. Conflicts of Interest: The authors declare no conflict of interest. ...
doi:10.1101/2020.02.07.938555 fatcat:vmmrluz4g5ccdcxslr3xb3klx4
Chengsheng Zhu, Yannick Mahlich, Yanran Wang, and Zishuo Zeng (all Rutgers) for all discussions and to Dr. Sonakshi Bhattacharjee (Columbia) for help with the manuscript. ... (Enzyme Commission) numbers, compiled as in (Mahlich, et al., 2018). (2) We extracted all human enzymes with catalytic site annotations from the M-CSA database (Ribeiro, et al., 2018) and retained ... On the other hand, most Swiss-Prot EC annotations are annotated via function transfer by homology -a process (and some error in it (Mahlich, et al., 2018; Schnoes, et al., 2009) that ensure overrepresentation ...
doi:10.1101/578757 fatcat:yvay3o44sfbkpmr6vqmte4225u