fusionDB: assessing microbial diversity and environmental preferences via functional similarity networks [article]

Chengsheng Zhu, Yannick Mahlich, Yana Bromberg
2016 bioRxiv   pre-print
Microbial functional diversification is driven by environmental factors, i.e. microorganisms inhabiting the same environmental niche tend to be more functionally similar than those from different environments. In some cases, even closely phylogenetically related microbes differ more across environments than across taxa. While microbial similarities are often reported in terms of taxonomic relationships, no existing databases directly link microbial functions to the environment. We previously developed a method for comparing microbial functional similarities on the basis of proteins translated from the sequenced genomes. Here we describe fusionDB, a novel database that uses our functional data to represent 1,374 taxonomically distinct bacteria annotated with available metadata: habitat/niche, preferred temperature, and oxygen use. Each microbe is encoded as a set of functions represented by its proteome and individual microbes are connected via common functions. Users can search fusionDB via combinations of organism names and metadata. Moreover, the web interface allows mapping new microbial genomes to the functional spectrum of reference bacteria, rendering interactive similarity networks that highlight shared functionality. fusionDB provides a fast means of comparing microbes, identifying potential horizontal gene transfer events, and highlighting key environment-specific functionality. fusionDB is publicly available at http://services.bromberglab.org/fusiondb/.
doi:10.1101/035923 fatcat:wfjyszjv3bh7vdjnmpun67kdtu

An exhaustive analysis of single amino acid variants in helical transmembrane proteins [article]

Oscar Llorian-Salvador, Michael Bernhofer, Yannick Mahlich, Burkhard Rost
2019 bioRxiv   pre-print
Single nucleotide variants (SNVs) have been widely studied in the past due to being the main source of human genetic variation. Less is known about the effect of single amino acid variants (SAVs) due to the immense resources required for comprehensive experimental studies. In contrast, in silico methods predicting the effects of sequence variants upon molecular function and upon the organism are readily available and have contributed unexpected suggestions, e.g. that SAVs common to a human population (shared by >5% of the population) have, on average, more significant impact on the molecular function of proteins than do rare SAVs (shared by <1% of the population). Here, we investigated the impact of variants in a human population upon helical transmembrane proteins (TMPs). Three main results stood out. Firstly, common SAVs, on average, have stronger effects than rare SAVs for TMPs, and are enriched, in particular, in the membrane helices. Secondly, proteins with seven transmembrane helices (7TM, including GPCRs, i.e. G protein-coupled receptors) are depleted of SAVs in comparison to other proteins, possibly due to increased evolutionary constraints in these important proteins. Thirdly, rare SAVs with strong effect are significantly absent (over common SAVs) in signal peptide regions.
doi:10.1101/2019.12.18.881318 fatcat:cdrpp4i2kfbdvk77hpnuerrszu

HFSP: high speed homology-driven function annotation of proteins

Yannick Mahlich, Martin Steinegger, Burkhard Rost, Yana Bromberg
2018 Bioinformatics  
Motivation: The rapid drop in sequencing costs has produced many more (predicted) protein sequences than can feasibly be functionally annotated with wet-lab experiments. Thus, many computational methods have been developed for this purpose. Most of these methods employ homology-based inference, approximated via sequence alignments, to transfer functional annotations between proteins. The increase in the number of available sequences, however, has drastically increased the search space, thus significantly slowing down alignment methods. Results: Here we describe homology-derived functional similarity of proteins (HFSP), a novel computational method that uses results of a high-speed alignment algorithm, MMseqs2, to infer functional similarity of proteins on the basis of their alignment length and sequence identity. We show that our method is accurate (85% precision) and fast (more than 40-fold speed increase over state-of-the-art). HFSP can help correct at least a 16% error in legacy curations, even for a resource of as high quality as Swiss-Prot. These findings suggest HFSP as an ideal resource for large-scale functional annotation efforts.
doi:10.1093/bioinformatics/bty262 pmid:29950013 pmcid:PMC6022561 fatcat:i2qq3lpoajd4pchbdzgdvmaxte

Fingerprinting cities: differentiating subway microbiome functionality

Chengsheng Zhu, Maximilian Miller, Nick Lusskin, Yannick Mahlich, Yanran Wang, Zishuo Zeng, Yana Bromberg
2019 Biology Direct  
Accumulating evidence suggests that the human microbiome impacts individual and public health. City subway systems are human-dense environments, where passengers often exchange microbes. The MetaSUB project participants collected samples from subway surfaces in different cities and performed metagenomic sequencing. Previous studies focused on taxonomic composition of these microbiomes and no explicit functional analysis had been done till now.
doi:10.1186/s13062-019-0252-y pmid:31666099 pmcid:PMC6822482 fatcat:274ny5amqba2nj35qqgnpqd7uq

fusionDB: assessing microbial diversity and environmental preferences via functional similarity networks

Chengsheng Zhu, Yannick Mahlich, Maximilian Miller, Yana Bromberg
2017 Nucleic Acids Research  
Microbial functional diversification is driven by environmental factors, i.e. microorganisms inhabiting the same environmental niche tend to be more functionally similar than those from different environments. In some cases, even closely phylogenetically related microbes differ more across environments than across taxa. While microbial similarities are often reported in terms of taxonomic relationships, no existing databases directly link microbial functions to the environment. We previously developed a method for comparing microbial functional similarities on the basis of proteins translated from their sequenced genomes. Here, we describe fusionDB, a novel database that uses our functional data to represent 1374 taxonomically distinct bacteria annotated with available metadata: habitat/niche, preferred temperature, and oxygen use. Each microbe is encoded as a set of functions represented by its proteome and individual microbes are connected via common functions. Users can search fusionDB via combinations of organism names and metadata. Moreover, the web interface allows mapping new microbial genomes to the functional spectrum of reference bacteria, rendering interactive similarity networks that highlight shared functionality. fusionDB provides a fast means of comparing microbes, identifying potential horizontal gene transfer events, and highlighting key environment-specific functionality.
doi:10.1093/nar/gkx1060 pmid:29112720 pmcid:PMC5753390 fatcat:ejqpk3qidvecraq3czdennosz4

Common sequence variants affect molecular function more than rare variants?

Yannick Mahlich, Jonas Reeb, Maximilian Hecht, Maria Schelling, Tjaart Andries Petrus De Beer, Yana Bromberg, Burkhard Rost
2017 Scientific Reports  
Any two unrelated individuals differ by about 10,000 single amino acid variants (SAVs). Do these impact molecular function? Experimental answers cannot answer comprehensively, while state-of-the-art prediction methods can. We predicted the functional impacts of SAVs within human and for variants between human and other species. Several surprising results stood out. Firstly, four methods (CADD, PolyPhen-2, SIFT, and SNAP2) agreed within 10 percentage points on the percentage of rare SAVs predicted with effect. However, they differed substantially for the common SAVs: SNAP2 predicted, on average, more effect for common than for rare SAVs. Given the large ExAC data sets sampling 60,706 individuals, the differences were extremely significant (p-value < 2.2e-16). We provided evidence that SNAP2 might be closer to reality for common SAVs than the other methods, due to its different focus in development. Secondly, we predicted significantly higher fractions of SAVs with effect between healthy individuals than between species; the difference increased for more distantly related species. The same trends were maintained for subsets of only housekeeping proteins and when moving from exomes of 1,000 to 60,000 individuals. SAVs frozen at speciation might maintain protein function, while many variants within a species might bring about crucial changes, for better or worse.
Single nucleotide variants (SNVs) constitute the most frequent form of human genetic variation [1]. Here, we focus on non-synonymous SNVs, i.e. genomic variants that result in single amino acid variants (SAVs) in protein sequences. Children differ by about two SAVs from their parents (de novo variation), while any two unrelated individuals can differ by as many as 10-20 K [2]. The vast majority (99%) of the known unique SAVs are rare, i.e. observed in less than 1% of the population [1, 3]. Only about 0.5% of the unique SAVs are common, i.e. observed in over 5% of the population [1, 3]. SAVs can impact protein function in many ways. We might be inclined to classify SAVs according to what they affect or do not affect. Effects are commonly distinguished upon protein function and structure. This distinction has limited value because what changes structure often tends to affect function. Similarly, we might distinguish between the effect upon molecular function (e.g. binding stronger or not binding), upon the role of a protein in a process (native process hampered, blocked, or non-native role acquired), or upon the localization of a protein (e.g. protein makes it to the membrane or not). Again the problem of this distinction is that these aspects are coupled: for instance, effects upon molecular function and localization might affect the process or not. All of the above, we might classify as effects upon the protein. Unfortunately, from all experiments monitoring SAV effects in many model organisms, just a few tens of thousands of effects are available in public databases. For a tiny subset of these, enough detail is available to consider all effect types (structure vs. function, molecular vs. process vs. localization). We might consider the effect upon protein as molecular as opposed to the effect upon the organism, such as diseases. Toward this end, the distinction is often made between SAVs that cause severe monogenic diseases [4] (referred to as OMIM-type SAVs) or contribute to complex diseases [5] and low-effect SAVs, which are only cumulatively linked to our phenotypic  ... 
doi:10.1038/s41598-017-01054-2 pmid:28487536 pmcid:PMC5431670 fatcat:kivphgy27nfvrmbjl3oxorn3kq

Predicted Molecular Effects of Sequence Variants Link to System Level of Disease

Jonas Reeb, Maximilian Hecht, Yannick Mahlich, Yana Bromberg, Burkhard Rost, Rachel Karchin
2016 PLoS Computational Biology  
Developments in experimental and computational biology are advancing our understanding of how protein sequence variation impacts molecular protein function. However, the leap from the micro level of molecular function to the macro level of the whole organism, e.g. disease, remains barred. Here, we present new results emphasizing earlier work that suggested some links from molecular function to disease. We focused on non-synonymous single nucleotide variants, also referred to as single amino acid variants (SAVs). Building upon OMIA (Online Mendelian Inheritance in Animals), we introduced a curated set of 117 disease-causing SAVs in animals. Methods optimized to capture effects upon molecular function often correctly predict human (OMIM) and animal (OMIA) Mendelian disease-causing variants. We also predicted effects of human disease-causing variants in the mouse model, i.e. we put OMIM SAVs into mouse orthologs. Overall, fewer variants were predicted with effect in the model organism than in the original organism. Our results, along with other recent studies, demonstrate that predictions of molecular effects capture some important aspects of disease. Thus, in silico methods focusing on the micro level of molecular function can help to understand the macro system level of disease.
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. Competing Interests: The authors have declared that no competing interests exist.
 ...  diseases in human and introduced a data set for animal diseases that was also captured by prediction methods. Predicted effects were weaker when human variants were tested in silico in an animal model (here mouse). This is important to know because "mouse models" are commonly used to study human diseases. Overall, we provided some evidence for a link between the molecular level and some type of disease. Results and Discussion: OMIM variants predicted to have strong effect. SIFT [27] predicts the impact of variants upon molecular protein function by assessing the disruption of conserved residues. SNAP [17] predicts this impact by considering  ... 
doi:10.1371/journal.pcbi.1005047 pmid:27536940 pmcid:PMC4990455 fatcat:gharlccpnvbhbfpfdwwzvuosuy

Low Diversity of Human Variation Despite Mostly Mild Functional Impact of De Novo Variants

Yannick Mahlich, Maximillian Miller, Zishuo Zeng, Yana Bromberg
2021 Frontiers in Molecular Biosciences  
We previously proposed the concept of cross-species variants (CSV) analysis (Mahlich, et al., 2017), which is similar to but intuitively different from conservation evaluation.  ...  We previously reported (Mahlich, et al., 2017) that amino acid CSVs have less predicted molecular functional effects on average than human variation recorded by the Exome Aggregation Consortium (Lek  ... 
doi:10.3389/fmolb.2021.635382 pmid:33816556 pmcid:PMC8012514 fatcat:zli2txlejfdrxihpov5jd6epem

Quantifying structural relationships of metal-binding sites suggests origins of biological electron transfer

Yana Bromberg, Ariel A. Aptekmann, Yannick Mahlich, Linda Cook, Stefan Senn, Maximillian Miller, Vikas Nanda, Diego U. Ferreiro, Paul G. Falkowski
2022 Science Advances  
Computational exploration of similarities among metal-binding protein structural motifs elucidates the origins of life.
doi:10.1126/sciadv.abj3984 pmid:35030025 pmcid:PMC8759750 fatcat:blqt2wa4efeyldrhtjmjscbn2e

Homology-based inference sets the bar high for protein function prediction

Tobias Hamp, Rebecca Kassner, Stefan Seemayer, Esmeralda Vicedo, Christian Schaefer, Dominik Achten, Florian Auer, Ariane Boehm, Tatjana Braun, Maximilian Hecht, Mark Heron, Peter Hönigschmid (+8 others)
2013 BMC Bioinformatics  
doi:10.1186/1471-2105-14-s3-s7 pmid:23514582 pmcid:PMC3584931 fatcat:gwjjghihvrffdluv6lvlyzqoqu

A large-scale evaluation of computational protein function prediction

Predrag Radivojac, Wyatt T Clark, Tal Ronnen Oron, Alexandra M Schnoes, Tobias Wittkop, Artem Sokolov, Kiley Graim, Christopher Funk, Karin Verspoor, Asa Ben-Hur, Gaurav Pandey, Jeffrey M Yunes (+90 others)
2013 Nature Methods  
Automated annotation of protein function is challenging. As the number of sequenced genomes rapidly grows, the overwhelming majority of protein products can only be annotated computationally. If computational predictions are to be relied upon, it is crucial that the accuracy of these methods be high. Here we report the results from the first large-scale community-based critical assessment of protein function annotation (CAFA) experiment. Fifty-four methods representing the state of the art for protein function prediction were evaluated on a target set of 866 proteins from 11 organisms. Two findings stand out: (i) today's best protein function prediction algorithms substantially outperform widely used first-generation methods, with large gains on all types of targets; and (ii) although the top methods perform well enough to guide experiments, there is considerable need for improvement of currently available tools. The accurate annotation of protein function is key to understanding life at the molecular level and has great biomedical and pharmaceutical implications. However, with its inherent difficulty and expense, experimental characterization of function cannot scale up to accommodate the vast amount of sequence data already  ... 
doi:10.1038/nmeth.2340 pmid:23353650 pmcid:PMC3584181 fatcat:7rqgsz4wgfa45lb4iirt2sa4ai

Genome Landscapes of Disease: Strategies to Predict the Phenotypic Consequences of Human Germline and Somatic Variation

Rachel Karchin, Ruth Nussinov
2016 PLoS Computational Biology  
Jonas Reeb, Maximilian Hecht, Yannick Mahlich, Yana Bromberg, and Burkhard Rost describe predictions of molecular effects of sequence variants to bridge the gap from the micro level of molecular function  ... 
doi:10.1371/journal.pcbi.1005043 pmid:27536867 pmcid:PMC4990343 fatcat:tw7gtnb4gvfktfpgejkoralrha

Snow microbiome functional analyses reveal novel aspects of microbial metabolism of complex organic compounds

Chengsheng Zhu, Maximilian Miller, Nicholas Lusskin, Benoît Bergk Pinto, Lorrie Maccario, Max Häggblom, Timothy Vogel, Catherine Larose, Yana Bromberg
2020 MicrobiologyOpen  
Yannick Mahlich, Yanran Wang, and Zishuo Zeng (Rutgers University) for the useful discussion and suggestions.  ...  metatranscriptomes under different conditions highlights the key microbial members and their molecular functions that result from and/or contribute to niche differences (Zhu, Delmont, Vogel, & Bromberg, 2015; Zhu, Mahlich  ... 
doi:10.1002/mbo3.1100 pmid:32762019 fatcat:wbad2vc62vg7jpal7bacdy5m5u

Snow Microbiome Functional Analyses Reveal Novel Microbial Metabolism of Complex Organic Compounds

Chengsheng Zhu, Maximilian Miller, Nicholas Lusskin, Benoit Bergk Pinto, Lorrie Maccario, Max Haggblom, Timothy Vogel, Catherine Larose, Yana Bromberg
2020 biorxiv/medrxiv  
Yannick Mahlich, Yanran Wang and Zishuo Zeng (Rutgers University) for the useful discussion and suggestions. Conflicts of Interest: The authors declare no conflict of interest.  ... 
doi:10.1101/2020.02.07.938555 fatcat:vmmrluz4g5ccdcxslr3xb3klx4

fuNTRp: Identifying protein positions for variation driven functional tuning [article]

M Miller, D Vitale, B Rost, Y Bromberg
2019 bioRxiv   pre-print
Chengsheng Zhu, Yannick Mahlich, Yanran Wang, and Zishuo Zeng (all Rutgers) for all discussions and to Dr. Sonakshi Bhattacharjee (Columbia) for help with the manuscript.  ...  (Enzyme Commission) numbers, compiled as in (Mahlich, et al., 2018). (2) We extracted all human enzymes with catalytic site annotations from the M-CSA database (Ribeiro, et al., 2018) and retained  ...  On the other hand, most Swiss-Prot EC annotations are annotated via function transfer by homology - a process (and some error in it (Mahlich, et al., 2018; Schnoes, et al., 2009) that ensure overrepresentation  ... 
doi:10.1101/578757 fatcat:yvay3o44sfbkpmr6vqmte4225u