The Pfam protein families database in 2019

Sara El-Gebali, Jaina Mistry, Alex Bateman, Sean R Eddy, Aurélien Luciani, Simon C Potter, Matloob Qureshi, Lorna J Richardson, Gustavo A Salazar, Alfredo Smart, Erik L L Sonnhammer, Layla Hirsh (+4 others)
2018 Nucleic Acids Research  
We carried out a significant comparison to the structural classification database, namely the Evolutionary Classification of Protein Domains (ECOD) that led to the creation of 825 new families based on  ...  The last few years have witnessed significant changes in Pfam ( The number of families has grown substantially to a total of 17,929 in release 32.0.  ...  ACKNOWLEDGEMENTS We would like to thank the various members of the EMBL-European Bioinformatics Institute training team for their assistance in developing the online training materials.  ... 
doi:10.1093/nar/gky995 pmid:30357350 pmcid:PMC6324024 fatcat:q5wtvxfqzzhobpis5p4d2g6wue

Phylogenetic and Evolutionary Analysis of the Late Embryogenesis Abundant (LEA) Gene Product in Poaceae

Darush Choobineh, Nafiseh Mahdinezhad, Ali Niazi, Baratali Fakheri, Abbasali Emamjomeh
2021 Journal of genetic resources  
The full-length LEA protein sequences were acquired by performing the sequence search of sequenced hva1 against Poaceae species in the non-redundant protein database by a BlastX search tool.  ...  Our data will provide novel insights for further studies of the Late Embryogenesis Abundant protein family in Poaceae.  ...  Acknowledgments This work has been supported by the University of Zabol in Grant code: UOZ-GR-9618-158  ... 
doi:10.22080/jgr.2021.20336.1229 doaj:a0c00d5c021848889ba703381fdc9295 fatcat:zbeve6bdvbc4pewcttdme6o2ea

PyFuncover: full proteome search for a specific function using BLAST and PFAM

Yoan Bouzin, Benjamin Thomas Viart, María Moriel-Carretero, Sofia Kossida
2019 EMBnet journal  
The pipeline coded in python uses BLAST alignment and the sequences from a PFAM family as the search seed.  ...  We tested PyFuncover using the fatty acid-binding family (FABP) Lipocalin_7 from PFAM (version 32, 2019) against the Homo sapiens NCBI proteome.  ...  Kossida and by the ATIP-Avenir program to M. Moriel-Carretero.  ... 
doi:10.14806/ej.24.0.925 fatcat:gdjj6jf63bdbja4vczqwsthyrq

Mapping OMIM Disease–Related Variations on Protein Domains Reveals an Association Among Variation Type, Pfam Models, and Disease Classes

Castrense Savojardo, Giulia Babbi, Pier Luigi Martelli, Rita Casadio
2021 Frontiers in Molecular Biosciences  
In this study, we investigate the occurrence and distribution of human disease–related variations in the context of Pfam domains.  ...  Human genome resequencing projects provide an unprecedented amount of data about single-nucleotide variations occurring in protein-coding regions and often leading to observable changes in the covalent  ...  All authors contributed to the article and approved the submitted version.  ... 
doi:10.3389/fmolb.2021.617016 pmid:34026820 pmcid:PMC8138129 fatcat:ckulfahfr5fqlmw6c5vl4i2oxy

Using Deep Learning to Annotate the Protein Universe [article]

Maxwell L. Bileschi, David Belanger, Drew Bryant, Theo Sanderson, Brandon Carter, D. Sculley, Mark A. DePristo, Lucy J. Colwell
2019 bioRxiv   pre-print
Using 80% of the full Pfam database we train a protein family predictor that is more accurate and over 200 times faster than BLASTp, while learning sequence features it was not trained on such as structural  ...  To address this, we report a deep learning model that learns the relationship between unaligned amino acid sequences and their functional classification across all 17929 families of the Pfam database.  ...  HMMs we use the highly curated Protein families (Pfam) database [9, 14].  ... 
doi:10.1101/626507 fatcat:h5vnd4wkkbccjfvccxsbrzvrcq

MOESM3 of Fueling ab initio folding with marine metagenomics enables structure and function predictions of new protein families

Yan Wang, Qiang Shi, Pengshuo Yang, Chengxin Zhang, S. Mortuza, Zhidong Xue, Kang Ning, Yang Zhang
2019 Figshare  
(A) using IMG database with 614 Pfam families; (B) using Tara Oceans with 27 Pfam families.  ...  First, the 27 Pfam families discussed in the paper do not represent the total number of proteins that are modellable using our pipeline.  ...  Response: Since our purpose of the study was to examine the potential of our pipeline, when combined with the Tara Oceans database, for modeling new Pfam families, we have selected to skip the families  ... 
doi:10.6084/m9.figshare.10184990.v1 fatcat:vry3ese2xrgavbh7na2ujcfuau

Rapid identification of novel protein families using similarity searches

Matt Jeffryes, Alex Bateman
2018 F1000Research  
Testing this method with the Pfam protein family database, we are able to compare potential new families to the over 17,000 existing families in Pfam in less than a second, with little loss in accuracy  ...  Protein family databases are an important tool for biologists trying to dissect the function of proteins.  ...  as yet unclassified in the protein family database.  ... 
doi:10.12688/f1000research.17315.1 pmid:30984369 pmcid:PMC6439793 fatcat:lhofvmydgbgzxgdwe2kp7phjne

PathFams: statistical detection of pathogen-associated protein domains

Briallen Lobb, Benjamin Jean-Marie Tremblay, Gabriel Moreno-Hagelsieb, Andrew C. Doxey
2021 BMC Genomics  
Results To facilitate virulence factor discovery, we performed a comprehensive analysis of 17,929 protein domain families within the Pfam database, and scored them based on their overrepresentation in  ...  Conclusions We identify pathogen-associated domain families, candidate virulence factors in the human gut, and eukaryotic-like mimicry domains with likely roles in virulence.  ...  Abundance and taxonomic breadth The NCBI sequence database domain alignments were sourced current_release/Pfam-A.full.ncbi (Pfam v.32.0; retrieved Feb. 9, 2019  ... 
doi:10.1186/s12864-021-07982-8 pmid:34521345 fatcat:onyzqp6ehbexfodvveikrjzloa

Folding non-homologous proteins by coupling deep-learning contact maps with I-TASSER assembly simulations

Wei Zheng, Chengxin Zhang, Yang Li, Robin Pearce, Eric W. Bell, Yang Zhang
2021 Cell Reports Methods  
When applied to a folding experiment on 8,266 unsolved Pfam families, C-I-TASSER successfully folded 4,162 domain families, including 504 folds that are not found in the PDB.  ...  Structure prediction for proteins lacking homologous templates in the Protein Data Bank (PDB) remains a significant unsolved problem.  ...  This work is supported in part by the NIGMS (GM136422 and S10OD026825), the NIAID (AI134678), and the NSF (IIS1901191, DBI2030790, and MTM2025426).  ... 
doi:10.1016/j.crmeth.2021.100014 pmid:34355210 pmcid:PMC8336924 fatcat:arhhxqtquvhr3o7xnzxbcdnlme

MOESM1 of Fueling ab initio folding with marine metagenomics enables structure and function predictions of new protein families

Yan Wang, Qiang Shi, Pengshuo Yang, Chengxin Zhang, S. Mortuza, Zhidong Xue, Kang Ning, Yang Zhang
2019 Figshare  
A breakdown of the Pfam families based on different metagenome database searches. Figure S4.  ...  Taxonomical distribution of all the genera in the Pfam families that are modellable using different sequence samples. Figure S5.  ...  S5 Species distribution for 797 Pfam families modeled with the combined Tara and MetaClust dataset. (A) Species distribution for 797 Pfam families based on the record in Pfam database.  ... 
doi:10.6084/m9.figshare.10184972 fatcat:yjdekzrsbzhjzg45ydaw35ckeu

TASmania: A bacterial Toxin-Antitoxin Systems database

Hatice Akarsu, Patricia Bordes, Moise Mansour, Donna-Joe Bigot, Pierre Genevaux, Laurent Falquet, Mark M. Tanaka
2019 PLoS Computational Biology  
We have herein developed a new in silico discovery pipeline named TASmania, which mines the >41K assemblies of the EnsemblBacteria database for known and uncharacterized protein components of type I to  ...  TAs families.  ...  Acknowledgments Computing resources of Vital-IT group of the Swiss Institute of Bioinformatics and the Interfaculty Bioinformatics Unit of University of Fribourg and University of Bern were used.  ... 
doi:10.1371/journal.pcbi.1006946 pmid:31022176 pmcid:PMC6504116 fatcat:rrziwojgenb5zdotdzpdro3csm

Molecular replacement using structure predictions from databases

Adam J. Simpkin, Jens M. H. Thomas, Felix Simkovic, Ronan M. Keegan, Daniel J. Rigden
2019 Acta Crystallographica Section D: Structural Biology  
Covariance-assisted ab initio models representing structurally uncharacterized Pfam families are now available on a large scale in databases, potentially representing a valuable and easily accessible supplement  ...  Here, the unconventional MR pipeline AMPLE is employed to explore the value of structure predictions in the GREMLIN and PconsFam databases.  ...  The PconsFam database (Lamb et al., 2019) contains single-structure predictions for 13 617 proteins, again each representing a Pfam family.  ... 
doi:10.1107/s2059798319013962 pmid:31793899 pmcid:PMC6889911 fatcat:p7mpzvyi6nfk3jq7jwemqlhc3i

Digging with Experimental Pick and Computational Shovel: a New Addition to the Histidine Kinase Superfamily

I. B. Zhulin
2003 Journal of Bacteriology  
Figure 1 shows that scanning a protein SMa2063 from Sinorhizobium meliloti, a member of the newly identified HWE family (10), against the SMART database results in a prediction of "protein of unknown  ...  Profile hidden Markov models (HMMs) were designed for these domains, enabling their rapid detection and visualization in protein sequences in two primary domain databases, Pfam (2) and SMART (11).  ...  They can carry out online annotation of proteins they study, link them to the appropriate peer-reviewed publications, and communicate their suggestions on database improvements directly to the database  ... 
doi:10.1128/jb.186.2.267-269.2004 pmid:14702293 pmcid:PMC305774 fatcat:f4rowe5vufcffenghgpbqdqvji

Automatic Prediction and Annotation: There Are Strong Biases for Multigenic Families

Catherine Mathé, Christophe Dunand
2021 Frontiers in Genetics  
ACKNOWLEDGMENTS The authors are thankful to the Paul Sabatier-Toulouse 3 University and to the Center National de la Recherche Scientifique (CNRS) for granting their work. The authors also thank Dr.  ...  CONCLUSION Expert annotations for large protein families and dedicated databases with manually verified proteins used as reference for prediction and annotation of additional genes are the solution.  ...  Currently, experts are already available for 166 families from The Arabidopsis Information Resource (TAIR) (https://www. and a few databases are dedicated to protein  ... 
doi:10.3389/fgene.2021.697477 pmid:34603370 pmcid:PMC8481831 fatcat:4lszaefffffnngoqkjy63ggyi4

Fueling ab initio folding with marine metagenomics enables structure and function predictions of new protein families

Yan Wang, Qiang Shi, Pengshuo Yang, Chengxin Zhang, S. M. Mortuza, Zhidong Xue, Kang Ning, Yang Zhang
2019 Genome Biology  
Using recent advances in marine genomics, we explore new applications of oceanic metagenomes for protein structure and function prediction.  ...  The ocean microbiome represents one of the largest microbiomes and produces nearly half of the primary energy on the planet through photosynthesis or chemosynthesis.  ...  Wei Zheng for the technical assistance in IMG/M data preparation.  ... 
doi:10.1186/s13059-019-1823-z pmid:31676016 pmcid:PMC6825341 fatcat:dirks5rtsvee7k2vcxz2uvo76m