
Domains, motifs and clusters in the protein universe

Jinfeng Liu, Burkhard Rost
2003 Current Opinion in Chemical Biology  
Last, not least, thanks to all those who deposit their experimental data in public databases, and to those who maintain these databases.  ...  JL and BR were supported by the grants 1-P50-GM62413-01 and RO1-GM63029-01 from the National Institute of Health (NIH).  ...  None of these is well enough established yet for large-scale sequence analysis.  ... 
doi:10.1016/s1367-5931(02)00003-0 pmid:12547420 fatcat:3etuaqalbnbr5hxzdbajd3cdrq

A New Method for Database Searching and Clustering

Antje Krause, Martin Vingron
1997 Genome Informatics Series  
In practice we achieve unambiguous assignment of 80% of Swiss-Prot sequences to non-overlapping sequence clusters in an entirely automatic fashion.  ...  The search method virtually never produces false positive hits while determining meaningfully large sets of sequences related to the query.  ...  Acknowledgments We thank Marc Rehmsmeier for the creation of the World Wide Web interface.  ... 
doi:10.11234/gi1990.8.90 fatcat:bekpxrkiwbaj3gvasp45t7pj7a

Functional annotation prediction: All for one and one for all

Ori Sasson, Noam Kaplan, Michal Linial
2006 Protein Science  
ProtoNet is a hierarchical organization of the protein sequences in the UniProt database.  ...  In an era of rapid genome sequencing and high-throughput technology, automatic function prediction for a novel sequence is of utter importance in bioinformatics.  ...  Acknowledgments Special thanks to the ProtoNet team for support at all stages of ProtoNet development.  ... 
doi:10.1110/ps.062185706 pmid:16672244 pmcid:PMC2242553 fatcat:invabbcwcfb5rlgzfvv7kzkx3q

GreenPhylDB: A Gene Family Database for plant functional Genomics

Mathieu Rouard, Matthieu Conte, Marie-Angélique Laporte, Christophe Périn
2009 Nature Precedings  
GreenPhylDB v2.0a contains groups of protein-coding gene sequences automatically clustered from 12 complete genomes of plants (fig. 1) that cover most of the taxonomy of green plants.  ...  Nowadays, most of the manual annotation in biology is done on gene sequences or protein patterns but relatively little is done for gene families at large.  ...  This section sums up high quality annotations available in external databases for protein sequences of a cluster.  ... 
doi:10.1038/npre.2009.3136.1 fatcat:5gmt6mvemvgbpmisarpvhvia3m

CluSTr: a database of clusters of SWISS-PROT+TrEMBL proteins

E. V. Kriventseva
2001 Nucleic Acids Research  
The CluSTr (Clusters of SWISS-PROT and TrEMBL proteins) database offers an automatic classification of SWISS-PROT and TrEMBL proteins into groups of related proteins.  ...  The clustering is based on analysis of all pairwise comparisons between protein sequences.  ...  We are also grateful to Beate Marx for administration of the relational database and helpful comments. This work was supported in part by grant B104-CT97-2099 of the European Commission.  ... 
doi:10.1093/nar/29.1.33 pmid:11125042 pmcid:PMC29804 fatcat:yo4y6wfvhzhd3gcqpryyn3vqhi

Automatic protein clustering as a basis of automatic annotation

Naoki Sato
2010 Nature Precedings  
Use of a single threshold should produce unusually large clusters containing unrelated proteins and divergent paralogs.  ...  Gclust clusters are suitable for annotation of data from new generation sequencer.  ...  • Cluster-based annotation is not susceptible for such inappropriate inheritance of annotation, even though individual annotations (given for original databases) may be variable or sometimes unreliable  ... 
doi:10.1038/npre.2010.5086 fatcat:5bopev43tzbrvpfrvp2ewmzwyq

Automatic protein clustering as a basis of automatic annotation

Naoki Sato
2010 Nature Precedings  
Use of a single threshold should produce unusually large clusters containing unrelated proteins and divergent paralogs.  ...  Gclust clusters are suitable for annotation of data from new generation sequencer.  ...  • Cluster-based annotation is not susceptible for such inappropriate inheritance of annotation, even though individual annotations (given for original databases) may be variable or sometimes unreliable  ... 
doi:10.1038/npre.2010.5086.1 fatcat:fuhp3k3c6zgsxgvtdd3yz5u5h4

Pfam: multiple sequence alignments and HMM-profiles of protein domains

E. Sonnhammer
1998 Nucleic Acids Research  
The definition of domain boundaries, family members and alignment is done semi-automatically based on expert knowledge, sequence similarity, other protein family databases and the ability of HMM-profiles  ...  ://genome.wustl.edu/Pfam/ Pfam 2.0 matches one or more domains in 50% of Swissprot-34 sequences, and 25% of a large sample of predicted proteins from the Caenorhabditis elegans genome.  ...  ACKNOWLEDGEMENTS We thank Robert Finn for preparing most of the new families for Pfam 2.0, and Jose Aguilar for writing and maintaining the Washington University Pfam server.  ... 
doi:10.1093/nar/26.1.320 pmid:9399864 pmcid:PMC147209 fatcat:a2np54xh3vgqrf2vcxkzbswudy

Annotated Expressed Sequence Tags (ESTs) from pre-smolt Atlantic salmon (Salmo salar) in a searchable data resource

Alexei A Adzhubei, Anna V Vlasova, Heidi Hagen-Larsen, Torgeir A Ruden, Jon K Laerdahl, Bjørn Høyheim
2007 BMC Genomics  
Conclusion: We describe the construction of cDNA libraries from juvenile/pre-smolt Atlantic salmon (Salmo salar), EST sequencing, clustering, and annotation by assigning putative function to the transcripts  ...  Such libraries allow identification of a large number of different transcripts and can provide valuable information on genes expressed in a particular tissue at a specific developmental stage.  ...  This work was supported by grant 139617/140 "Salmon Genome Project (SGP)" by the Research Council of Norway.  ... 
doi:10.1186/1471-2164-8-209 pmid:17605782 pmcid:PMC1913521 fatcat:3sooi3brbzf3zm3msdki5szzbm

The SYSTERS protein sequence cluster set

A. Krause
2000 Nucleic Acids Research  
The SYSTERS (short for SYSTEmatic Re-Searching) protein sequence cluster set consists of the classification of all sequences from SWISS-PROT and PIR into disjoint protein family clusters and hierarchically  ...  The cluster set can be searched with a sequence using the SSMAL search tool or a traditional database search tool like BLAST or FASTA.  ...  CONCLUSIONS The SYSTERS protein sequence cluster set provides an automatically generated classification of all sequences of the SWISS-PROT and PIR databases into families, superfamilies and subfamilies  ... 
doi:10.1093/nar/28.1.270 pmid:10592244 pmcid:PMC102384 fatcat:hgvkgpydtvfn3k4cfxgs7z22lq

TBestDB: a taxonomically broad database of expressed sequence tags (ESTs)

E. A. O'Brien, L. B. Koski, Y. Zhang, L. Yang, E. Wang, M. W. Gray, G. Burger, B. F. Lang
2007 Nucleic Acids Research  
The TBestDB database contains ~370 000 clustered expressed sequence tag (EST) sequences from 49 organisms, covering a taxonomically broad range of poorly studied, mainly unicellular eukaryotes, and includes  ...  The datasets are automatically checked for clustering errors due to chimerism and potential cross-contamination between organisms, and suspect data are flagged in or removed from the database.  ...  ACKNOWLEDGEMENTS Conflict of interest statement. None declared.  ... 
doi:10.1093/nar/gkl770 pmid:17202165 pmcid:PMC1899108 fatcat:o7pn4nlurjcshododpda4u7pta

ESAP plus: a web-based server for EST-SSR marker development

Piyarat Ponyared, Jiradej Ponsawat, Sissades Tongsima, Pusadee Seresangtakul, Chutipong Akkasaeng, Nathpapat Tantisuwichwong
2016 BMC Genomics  
With the advent of high-throughput sequencing technology, huge EST sequence data have been generated and are now accessible from many public databases.  ...  Some of these computational tools are not users friendly and must be tightly integrated with reference genomic databases.  ...  The full contents of the supplement are available online at https://bmcgenet.biomedcentral.com/articles/supplements/volume-17-supplement-13.  ... 
doi:10.1186/s12864-016-3328-4 pmid:28155670 pmcid:PMC5260030 fatcat:ksvwf4wqffagbih5lcheszas4e

Improvements to CluSTr: the database of SWISS-PROT+TrEMBL protein clusters

E. V. Kriventseva
2003 Nucleic Acids Research  
The CluSTr database (http://www.ebi.ac.uk/clustr/) offers an automatic classification of SWISS-PROT + TrEMBL proteins into groups of related proteins.  ...  The clustering is based on analysis of all pair-wise sequence comparisons between proteins using the Smith-Waterman algorithm.  ...  This work was supported in part by grant B104-CT97-2099 of the European Commission.  ... 
doi:10.1093/nar/gkg035 pmid:12520029 pmcid:PMC165482 fatcat:thgup2dhwvhzxabeyuegy3xmde

ProtoNet 4.0: A hierarchical classification of one million protein sequences

N. Kaplan
2004 Nucleic Acids Research  
ProtoNet is an automatic hierarchical classification of the protein sequence space.  ...  A large portion of these clusters was automatically assigned high confidence biological names according to their correspondence with functional annotations.  ...  Special thanks to Alex Savenok for the web design as well as for the development of the visualization tools.  ... 
doi:10.1093/nar/gki007 pmid:15608180 pmcid:PMC539961 fatcat:iuyjn7psr5fpfg5dzot6bhp5ie

ParPEST: a pipeline for EST data analysis based on parallel computing

Nunzio D'Agostino, Mario Aversano, Maria Chiusano
2005 BMC Bioinformatics  
Because of the advances in biotechnologies, ESTs are daily determined in the form of large datasets.  ...  Expressed Sequence Tags (ESTs) are short and error-prone DNA sequences generated from the 5' and 3' ends of randomly selected cDNA clones.  ...  Acknowledgements This work is supported by the Agronanotech Project (Ministry of Agriculture, Italy). We thank Prof. Luigi Frusciante and Prof. Gerardo Toraldo for all their support to our work.  ... 
doi:10.1186/1471-2105-6-s4-s9 pmid:16351758 pmcid:PMC1866376 fatcat:dtevffejpjc2ne6vlmatvg5ary