Pfam: The protein families database in 2021

Jaina Mistry, Sara Chuguransky, Lowri Williams, Matloob Qureshi, Gustavo A Salazar, Erik L L Sonnhammer, Silvio C E Tosatto, Lisanna Paladin, Shriya Raj, Lorna J Richardson, Robert D Finn, Alex Bateman
2020 Nucleic Acids Research  
The Pfam database is a widely used resource for classifying protein sequences into families and domains.  ...  We have compared all of the regions in the RepeatsDB to those in Pfam and have started to use the results to build and refine Pfam repeat families.  ...  We are grateful to Layla Hirsh Martinez and Aleix Lafita for adding families to Pfam and to the numerous scientists who contributed suggestions and families via our helpdesk. FUNDING  ... 
doi:10.1093/nar/gkaa913 pmid:33125078 pmcid:PMC7779014 fatcat:qhulduagfrg2jj7n25n5sppaue

Mapping OMIM Disease–Related Variations on Protein Domains Reveals an Association Among Variation Type, Pfam Models, and Disease Classes

Castrense Savojardo, Giulia Babbi, Pier Luigi Martelli, Rita Casadio
2021 Frontiers in Molecular Biosciences  
In this study, we investigate the occurrence and distribution of human disease–related variations in the context of Pfam domains.  ...  Human genome resequencing projects provide an unprecedented amount of data about single-nucleotide variations occurring in protein-coding regions and often leading to observable changes in the covalent  ...  All authors contributed to the article and approved the submitted version.  ... 
doi:10.3389/fmolb.2021.617016 pmid:34026820 pmcid:PMC8138129 fatcat:ckulfahfr5fqlmw6c5vl4i2oxy

Density Peak clustering of protein sequences associated to a Pfam clan reveals clear similarities and interesting differences with respect to manual family annotation

Elena Tea Russo, Alessandro Laio, Marco Punta
2021 BMC Bioinformatics  
Pfam is possibly the most well known protein family database, built in many years of work by domain experts with extensive use of manual curation.  ...  Background The identification of protein families is of outstanding practical importance for in silico protein annotation and is at the basis of several bioinformatic resources.  ...  Since not all sequences in UniRef50 are annotated in Pfam, we are not able to use the Pfam database family assignments directly.  ... 
doi:10.1186/s12859-021-04013-x pmid:33711918 fatcat:wqvfaqj7wvcfrjueutkjmip2yy

Decoding microbiome and protein family linkage to improve protein structure prediction [article]

Pengshuo Yang, Wei Zheng, Kang Ning, Yang Zhang
2021 bioRxiv   pre-print
These results revealed the important link of biomes with protein families and provided a useful bluebook to guide future microbiome sequence database and modeling development for protein structure and  ...  Large-scale protein family folding experiments showed that a targeted approach using predicted biomes significantly outperform combined metagenome datasets in both speed of MSA collection and accuracy  ...  ith biome dataset, and is the number of homologous sequences in the jth family from the Pfam database.  ... 
doi:10.1101/2021.04.15.440088 fatcat:vbbuv6cokrbblk7rulnifhirjq

Phylogenetic and Evolutionary Analysis of the Late Embryogenesis Abundant (LEA) Gene Product in Poaceae

Darush Choobineh, Nafiseh Mahdinezhad, Ali Niazi, Baratali Fakheri, Abbasali Emamjomeh
2021 Journal of genetic resources  
The full-length LEA protein sequences were acquired by performing the sequence search of sequenced hva1 against Poaceae species in the non-redundant protein database by a BlastX search tool.  ...  Our data will provide novel insights for further studies of the Late Embryogenesis Abundant protein family in Poaceae.  ...  Acknowledgments This work has been supported by the University of Zabol in Grant code: UOZ-GR-9618-158  ... 
doi:10.22080/jgr.2021.20336.1229 doaj:a0c00d5c021848889ba703381fdc9295 fatcat:zbeve6bdvbc4pewcttdme6o2ea

PathFams: statistical detection of pathogen-associated protein domains

Briallen Lobb, Benjamin Jean-Marie Tremblay, Gabriel Moreno-Hagelsieb, Andrew C. Doxey
2021 BMC Genomics  
Results To facilitate virulence factor discovery, we performed a comprehensive analysis of 17,929 protein domain families within the Pfam database, and scored them based on their overrepresentation in  ...  Conclusions We identify pathogen-associated domain families, candidate virulence factors in the human gut, and eukaryotic-like mimicry domains with likely roles in virulence.  ...  We anticipate that, similar to these cases, other candidate virulence factors may be identified Conclusions In this work, we analyzed all 17,929 protein domain families in the Pfam v32.0 database in  ... 
doi:10.1186/s12864-021-07982-8 pmid:34521345 fatcat:onyzqp6ehbexfodvveikrjzloa

The Flo Adhesin Family

Ronnie G. Willaert, Yeseren Kayacan, Bart Devreese
2021 Pathogens  
Finally, we identified from Pfam database datamining yeasts that could express Flo adhesins and are encountered in human infections and their adhesin architectures.  ...  One of the main players involved in this are the expressed cell wall adhesins. Here, we review the Flo adhesin family and their involvement in the adhesion of these yeasts during human infections.  ...  Acknowledgments: The Research Council of the Vrije Universiteit Brussel (VUB) (Belgium) and the University of Gent (Belgium) are acknowledged to support the Alliance Research Group VUB-UGent NanoMicrobiology  ... 
doi:10.3390/pathogens10111397 pmid:34832553 pmcid:PMC8621652 fatcat:zviivbj4ujdrjkvwjc73adx5eu

DomainViz: intuitive visualization of consensus domain distributions across groups of proteins

Pascal Schläpfer, Devang Mehta, Cameron Ridderikhoff, R Glen Uhrig
2021 Nucleic Acids Research  
Currently, DomainViz uses the well-established PFAM and Prosite databases for domain searching and assembles intuitive, publication-ready 'monument valley' plots (mv-plots) that display the extent of domain  ...  The prediction of functional domains is typically among the first steps towards understanding the function of new proteins and protein families.  ...  ACKNOWLEDGEMENTS The authors thank John Bartoszewski and Broderick Wood of the Faculty of Science Research IT team (University of Alberta) for their assistance in setting up the web server, Mohamad Jamaleddine  ... 
doi:10.1093/nar/gkab391 pmid:34023887 fatcat:fuv5dniizvet7fauha23oaespe

Folding non-homologous proteins by coupling deep-learning contact maps with I-TASSER assembly simulations

Wei Zheng, Chengxin Zhang, Yang Li, Robin Pearce, Eric W. Bell, Yang Zhang
2021 Cell Reports Methods  
When applied to a folding experiment on 8,266 unsolved Pfam families, C-I-TASSER successfully folded 4,162 domain families, including 504 folds that are not found in the PDB.  ...  Structure prediction for proteins lacking homologous templates in the Protein Data Bank (PDB) remains a significant unsolved problem.  ...  This work is supported in part by the NIGMS (GM136422 and S10OD026825), the NIAID (AI134678), and the NSF (IIS1901191, DBI2030790, and MTM2025426).  ... 
doi:10.1016/j.crmeth.2021.100014 pmid:34355210 pmcid:PMC8336924 fatcat:arhhxqtquvhr3o7xnzxbcdnlme

Genome Wide Characterization & Phylogenetic Analysis of TNF Genes in Homo Sapiens

Saif S, Mazhar MW, Sikandar M, Waqas N, Mahmood J, Aslam H, Abaidullah M, Ijaz A (+4 others); Department of Bioinformatics and Biotechnology, Government College University, 38000 Faisalabad, Pakistan
2021 Austin Journal of Pharmacology and Therapeutics  
In this study the genome wide identification of TNF gene was done. Different tools and databases were used.  ...  A number of TNF receptors mediated factors have been identified having a major role in signal transduction pathways of TNF gene family.  ...  The exon-intron structure of the TNF gene family was found to be very similar in this study. However, there was a wide range of gene distribution across chromosomes.  ... 
doi:10.26420/austinjpharmacolther.2021.1134 fatcat:4is3mneo6bewraxqgnjrehd55q

Automatic Prediction and Annotation: There Are Strong Biases for Multigenic Families

Catherine Mathé, Christophe Dunand
2021 Frontiers in Genetics  
ACKNOWLEDGMENTS The authors are thankful to the Paul Sabatier-Toulouse 3 University and to the Center National de la Recherche Scientifique (CNRS) for granting their work. The authors also thank Dr.  ...  All PFAM descriptions are available from https://pfam. (Mistry et al., 2021). September 2021 | Volume 12 | Article 697477 Frontiers in Genetics |  ...  CONCLUSION Expert annotations for large protein families and dedicated databases with manually verified proteins used as reference for prediction and annotation of additional genes are the solution.  ... 
doi:10.3389/fgene.2021.697477 pmid:34603370 pmcid:PMC8481831 fatcat:4lszaefffffnngoqkjy63ggyi4

Discovery of fibrillar adhesins across bacterial species

Vivian Monzon, Aleix Lafita, Alex Bateman
2021 BMC Genomics  
Based on the presence of these domains in the UniProt Reference Proteomes database, we identified and analysed 3,542 fibrillar adhesin-like proteins across species of the most common bacterial phyla.  ...  the protein, while their positions are more variable in Gram negative bacteria.  ...  In each case we identified the relevant entries in the Pfam database and recorded the relevant identifiers.  ... 
doi:10.1186/s12864-021-07586-2 pmid:34275445 pmcid:PMC8286594 fatcat:yhm4f6tasjerrmmuodu6ccmoqy

BonMOLière: Small-Sized Libraries of Readily Purchasable Compounds, Optimized to Produce Genuine Hits in Biological Screens across the Protein Space

Neann Mathai, Conrad Stork, Johannes Kirchmair
2021 International Journal of Molecular Sciences  
The best of the optimized compound libraries prepared in this work are available for download as a dataset bundle ("BonMOLière").  ...  Therefore, small to medium-sized compound libraries with a high chance of producing genuine hits on an arbitrary protein of interest would be of great value to fields related to early drug discovery, in  ...  A portion of the calculations described in this work were performed on resources provided by UNINETT Sigma2-the National Infrastructure for High Performance Computing and Data Storage in Norway.  ... 
doi:10.3390/ijms22157773 fatcat:nwm7l5xuabbkdn45mzk2zddmna

eggNOG-mapper v2: Functional Annotation, Orthology Assignments, and Domain Prediction at the Metagenomic Scale [article]

Carlos P Cantalapiedra, Ana Hernandez-Plaza, Ivica Letunic, Peer Bork, Jaime Huerta-Cepas
2021 bioRxiv   pre-print
Improvements in version 2 include a full update of both the genomes and functional databases underlying eggNOG v5, as well as several efficiency enhancements and new features.  ...  Most notably, eggNOG-mapper v2 now allows: (i) de novo gene prediction from raw contigs, (ii) built-in pairwise orthology prediction, (iii) fast protein domain discovery, and (iv) automated GFF decoration  ...  Protein domain annotations Along with the functional terms annotated per query, this new version of eggNOG-mapper provides PFAM (Mistry et al. 2020 ) and SMART (Letunic et al. 2021 ) protein domain predictions  ... 
doi:10.1101/2021.06.03.446934 fatcat:do4oh4xcl5cerm73pmakdwlfbi

RepeatsDB in 2021: improved data and extended classification for protein tandem repeat structures

Lisanna Paladin, Martina Bevilacqua, Sara Errigo, Damiano Piovesan, Ivan Mičetić, Marco Necci, Alexander Miguel Monzon, Maria Laura Fabre, Jose Luis Lopez, Juliet F Nilsson, Javier Rios, Pablo Lorenzano Menna (+11 others)
2020 Nucleic Acids Research  
(Clan > Family) requiring sequence similarity and describing repeat motifs in collaboration with Pfam.  ...  Protein tandem repeats are ubiquitous in all branches of the tree of life.  ...  ACKNOWLEDGEMENTS RepeatsDB is a service of ELIXIR-IIB (, the Italian Node of the European ELIXIR infrastructure for biological data (  ... 
doi:10.1093/nar/gkaa1097 pmid:33237313 fatcat:l7r6wzdo6rcm5k7cbadrttboha