The Pfam protein families database in 2019
2018
Nucleic Acids Research
We carried out a significant comparison to the structural classification database, namely the Evolutionary Classification of Protein Domains (ECOD) that led to the creation of 825 new families based on ...
The last few years have witnessed significant changes in Pfam (https://pfam.xfam.org). The number of families has grown substantially to a total of 17,929 in release 32.0. ...
ACKNOWLEDGEMENTS We would like to thank the various members of the EMBL-European Bioinformatics Institute training team for their assistance in developing the online training materials. ...
doi:10.1093/nar/gky995
pmid:30357350
pmcid:PMC6324024
fatcat:q5wtvxfqzzhobpis5p4d2g6wue
Phylogenetic and Evolutionary Analysis of the Late Embryogenesis Abundant (LEA) Gene Product in Poaceae
2021
Journal of genetic resources
The full-length LEA protein sequences were acquired by performing the sequence search of sequenced hva1 against Poaceae species in the non-redundant protein database by a BlastX search tool. ...
Our data will provide novel insights for further studies of the Late Embryogenesis Abundant protein family in Poaceae. ...
Acknowledgments This work has been supported by the University of Zabol in Grant code: UOZ-GR-9618-158 ...
doi:10.22080/jgr.2021.20336.1229
doaj:a0c00d5c021848889ba703381fdc9295
fatcat:zbeve6bdvbc4pewcttdme6o2ea
PyFuncover: full proteome search for a specific function using BLAST and PFAM
2019
EMBnet journal
The pipeline coded in python uses BLAST alignment and the sequences from a PFAM family as the search seed. ...
We tested PyFuncover using the fatty acid-binding family (FABP) Lipocalin_7 from PFAM (version 32, 2019) against the Homo sapiens NCBI proteome. ...
Kossida and by the ATIP-Avenir program to M. Moriel-Carretero. ...
doi:10.14806/ej.24.0.925
fatcat:gdjj6jf63bdbja4vczqwsthyrq
Mapping OMIM Disease–Related Variations on Protein Domains Reveals an Association Among Variation Type, Pfam Models, and Disease Classes
2021
Frontiers in Molecular Biosciences
In this study, we investigate the occurrence and distribution of human disease–related variations in the context of Pfam domains. ...
Human genome resequencing projects provide an unprecedented amount of data about single-nucleotide variations occurring in protein-coding regions and often leading to observable changes in the covalent ...
All authors contributed to the article and approved the submitted version. ...
doi:10.3389/fmolb.2021.617016
pmid:34026820
pmcid:PMC8138129
fatcat:ckulfahfr5fqlmw6c5vl4i2oxy
Using Deep Learning to Annotate the Protein Universe
[article]
2019
bioRxiv
pre-print
Using 80% of the full Pfam database we train a protein family predictor that is more accurate and over 200 times faster than BLASTp, while learning sequence features it was not trained on such as structural ...
To address this, we report a deep learning model that learns the relationship between unaligned amino acid sequences and their functional classification across all 17929 families of the Pfam database. ...
HMMs we use the highly curated Protein families (Pfam) database [9, 14]. ...
doi:10.1101/626507
fatcat:h5vnd4wkkbccjfvccxsbrzvrcq
MOESM3 of Fueling ab initio folding with marine metagenomics enables structure and function predictions of new protein families
2019
Figshare
(A) using IMG database with 614 Pfam families; (B) using Tara Oceans with 27 Pfam families. ...
First, the 27 Pfam families discussed in the paper do not represent the total number of proteins that are modellable using our pipeline. ...
Response: Since our purpose of the study was to examine the potential of our pipeline, when combined with the Tara Oceans database, for modeling new Pfam families, we have selected to skip the families ...
doi:10.6084/m9.figshare.10184990.v1
fatcat:vry3ese2xrgavbh7na2ujcfuau
Rapid identification of novel protein families using similarity searches
2018
F1000Research
Testing this method with the Pfam protein family database, we are able to compare potential new families to the over 17,000 existing families in Pfam in less than a second, with little loss in accuracy ...
Protein family databases are an important tool for biologists trying to dissect the function of proteins. ...
as yet unclassified in the protein family database. ...
doi:10.12688/f1000research.17315.1
pmid:30984369
pmcid:PMC6439793
fatcat:lhofvmydgbgzxgdwe2kp7phjne
PathFams: statistical detection of pathogen-associated protein domains
2021
BMC Genomics
Results To facilitate virulence factor discovery, we performed a comprehensive analysis of 17,929 protein domain families within the Pfam database, and scored them based on their overrepresentation in ...
Conclusions We identify pathogen-associated domain families, candidate virulence factors in the human gut, and eukaryotic-like mimicry domains with likely roles in virulence. ...
Abundance and taxonomic breadth The NCBI sequence database domain alignments were sourced from www.ftp.ebi.ac.uk/pub/databases/Pfam/current_release/Pfam-A.full.ncbi (Pfam v.32.0; retrieved Feb. 9, 2019 ...
doi:10.1186/s12864-021-07982-8
pmid:34521345
fatcat:onyzqp6ehbexfodvveikrjzloa
Folding non-homologous proteins by coupling deep-learning contact maps with I-TASSER assembly simulations
2021
Cell Reports Methods
When applied to a folding experiment on 8,266 unsolved Pfam families, C-I-TASSER successfully folded 4,162 domain families, including 504 folds that are not found in the PDB. ...
Structure prediction for proteins lacking homologous templates in the Protein Data Bank (PDB) remains a significant unsolved problem. ...
This work is supported in part by the NIGMS (GM136422 and S10OD026825), the NIAID (AI134678), and the NSF (IIS1901191, DBI2030790, and MTM2025426). ...
doi:10.1016/j.crmeth.2021.100014
pmid:34355210
pmcid:PMC8336924
fatcat:arhhxqtquvhr3o7xnzxbcdnlme
MOESM1 of Fueling ab initio folding with marine metagenomics enables structure and function predictions of new protein families
2019
Figshare
A breakdown of the Pfam families based on different metagenome database searches. Figure S4. ...
Taxonomical distribution of all the genera in the Pfam families that are modellable using different sequence samples. Figure S5. ...
S5 Species distribution for 797 Pfam families modeled with the combined Tara and MetaClust dataset. (A) Species distribution for 797 Pfam families based on the record in Pfam database. ...
doi:10.6084/m9.figshare.10184972
fatcat:yjdekzrsbzhjzg45ydaw35ckeu
TASmania: A bacterial Toxin-Antitoxin Systems database
2019
PLoS Computational Biology
We have herein developed a new in silico discovery pipeline named TASmania, which mines the >41K assemblies of the EnsemblBacteria database for known and uncharacterized protein components of type I to ...
TAs families. ...
Acknowledgments Computing resources of Vital-IT group of the Swiss Institute of Bioinformatics and the Interfaculty Bioinformatics Unit of University of Fribourg and University of Bern were used. ...
doi:10.1371/journal.pcbi.1006946
pmid:31022176
pmcid:PMC6504116
fatcat:rrziwojgenb5zdotdzpdro3csm
Molecular replacement using structure predictions from databases
2019
Acta Crystallographica Section D: Structural Biology
Covariance-assisted ab initio models representing structurally uncharacterized Pfam families are now available on a large scale in databases, potentially representing a valuable and easily accessible supplement ...
Here, the unconventional MR pipeline AMPLE is employed to explore the value of structure predictions in the GREMLIN and PconsFam databases. ...
The PconsFam database (Lamb et al., 2019) contains single-structure predictions for 13 617 proteins, again each representing a Pfam family. ...
doi:10.1107/s2059798319013962
pmid:31793899
pmcid:PMC6889911
fatcat:p7mpzvyi6nfk3jq7jwemqlhc3i
Digging with Experimental Pick and Computational Shovel: a New Addition to the Histidine Kinase Superfamily
2003
Journal of Bacteriology
Figure 1 shows that scanning a protein SMa2063 from Sinorhizobium meliloti, a member of the newly identified HWE family (10), against the SMART database results in a prediction of "protein of unknown ...
Profile hidden Markov models (HMMs) were designed for these domains, enabling their rapid detection and visualization in protein sequences in two primary domain databases, Pfam (2) and SMART (11). ...
They can carry out online annotation of proteins they study, link them to the appropriate peer-reviewed publications, and communicate their suggestions on database improvements directly to the database ...
doi:10.1128/jb.186.2.267-269.2004
pmid:14702293
pmcid:PMC305774
fatcat:f4rowe5vufcffenghgpbqdqvji
Automatic Prediction and Annotation: There Are Strong Biases for Multigenic Families
2021
Frontiers in Genetics
ACKNOWLEDGMENTS The authors are thankful to the Paul Sabatier-Toulouse 3 University and to the Center National de la Recherche Scientifique (CNRS) for granting their work. The authors also thank Dr. ...
CONCLUSION Expert annotations for large protein families and dedicated databases with manually verified proteins used as reference for prediction and annotation of additional genes are the solution. ...
Currently, experts are already available for 166 families from The Arabidopsis Information Resource (TAIR) (https://www.arabidopsis.org/browse/genefamily/) and a few databases are dedicated to protein ...
doi:10.3389/fgene.2021.697477
pmid:34603370
pmcid:PMC8481831
fatcat:4lszaefffffnngoqkjy63ggyi4
Fueling ab initio folding with marine metagenomics enables structure and function predictions of new protein families
2019
Genome Biology
Using recent advances in marine genomics, we explore new applications of oceanic metagenomes for protein structure and function prediction. ...
The ocean microbiome represents one of the largest microbiomes and produces nearly half of the primary energy on the planet through photosynthesis or chemosynthesis. ...
Wei Zheng for the technical assistance in IMG/M data preparation. ...
doi:10.1186/s13059-019-1823-z
pmid:31676016
pmcid:PMC6825341
fatcat:dirks5rtsvee7k2vcxz2uvo76m