RefSeq: an update on prokaryotic genome annotation and curation

Daniel H Haft, Michael DiCuccio, Azat Badretdin, Vyacheslav Brover, Vyacheslav Chetvernin, Kathleen O'Neill, Wenjun Li, Farideh Chitsaz, Myra K Derbyshire, Noreen R Gonzales, Marc Gwadz, Fu Lu (+9 others)
2017 Nucleic Acids Research  
(HMMs), release of an updated pipeline (PGAP-4), and comprehensive re-annotation of RefSeq prokaryotic genomes.  ...  Genomes are annotated by a single Prokaryotic Genome Annotation Pipeline (PGAP) to provide users with a resource that is as consistent and accurate as possible.  ...  sequence regions (CDS) annotated on RefSeq genomes.  ... 
doi:10.1093/nar/gkx1068 pmid:29112715 pmcid:PMC5753331 fatcat:hgvzb5wly5flvfsj7wlev6myby

EcoGene-RefSeq: EcoGene tools applied to the RefSeq prokaryotic genomes

J. Zhou, A. J. Richardson, K. E. Rudd
2013 Bioinformatics  
EcoGene is a major source of annotation updates for the MG1655 Genbank record, one of only a few Genbank genome records that are updated by a community effort.  ...  EcoGene-RefSeq is being developed as a stand-alone internet resource to facilitate the usage of EcoGene-based tools on any of the >2400 completed prokaryotic genome records that are currently available  ...  In the future, we can add capabilities into EcoGene-RefSeq, including manual curation tools enabling an individual or interested group to build and re-annotate an EcoGene-like database for any prokaryotic  ... 
doi:10.1093/bioinformatics/btt302 pmid:23736533 pmcid:PMC3712216 fatcat:jon55wz7lbcoporxe3hb2u2emu

Update on RefSeq microbial genomes resources

Tatiana Tatusova, Stacy Ciufo, Scott Federhen, Boris Fedorov, Richard McVeigh, Kathleen O'Neill, Igor Tolstoy, Leonid Zaslavsky
2014 Nucleic Acids Research  
A number of improvements have been incorporated into the Prokaryotic Genome Annotation Pipeline.  ...  Several new features have been added to RefSeq prokaryotic genomes data processing pipeline including the calculation of genome groups (clades) and the optimization of protein clusters generation using  ...  All RefSeq genomes are annotated by NCBI pipeline except for the Reference genomes manually curated by community and NCBI staff (2).  ... 
doi:10.1093/nar/gku1062 pmid:25510495 pmcid:PMC4383903 fatcat:52oqwfkoircovp4jgymgm4nhry

RefSeq microbial genomes database: new representation and annotation strategy

Tatiana Tatusova, Stacy Ciufo, Boris Fedorov, Kathleen O'Neill, Igor Tolstoy
2013 Nucleic Acids Research  
New strategies have been developed for the annotation and representation of reference genomes and sequence variations derived from population studies and clinical outbreaks.  ...  This huge increase in DNA sequence data presents new challenges for the annotation, analysis and visualization bioinformatics tools.  ...  RefSeq prokaryotic genomes are organized in several new categories based on curated attributes and assembly and annotation quality measures.  ... 
doi:10.1093/nar/gkt1274 pmid:24316578 pmcid:PMC3965038 fatcat:7yynuxostnhpjlx6mcrezaekqq

RefSeq microbial genomes database: new representation and annotation strategy

T. Tatusova, S. Ciufo, B. Fedorov, K. O'Neill, I. Tolstoy
2015 Nucleic Acids Research  
New strategies have been developed for the annotation and representation of reference genomes and sequence variations derived from population studies and clinical outbreaks.  ...  This huge increase in DNA sequence data presents new challenges for the annotation, analysis and visualization bioinformatics tools.  ...  RefSeq prokaryotic genomes are organized in several new categories based on curated attributes and assembly and annotation quality measures.  ... 
doi:10.1093/nar/gkv278 pmid:25824943 pmcid:PMC4402550 fatcat:o3opkwpazfdvrkml2uliuxy2py

Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation

Nuala A. O'Leary, Mathew W. Wright, J. Rodney Brister, Stacy Ciufo, Diana Haddad, Rich McVeigh, Bhanu Rajput, Barbara Robbertse, Brian Smith-White, Danso Ako-Adjei, Alexander Astashyn, Azat Badretdin (+43 others)
2015 Nucleic Acids Research  
The RefSeq project at the National Center for Biotechnology Information (NCBI) maintains and curates a publicly available database of annotated genomic, transcript, and protein sequence records (http:/  ...  We also highlight diverse functional curation initiatives that support multiple uses of RefSeq data including taxonomic validation, genome annotation, comparative genomics, and clinical testing.  ...  and accuracy of the represented sequence, structural annotation, and functional annotation.  ... 
doi:10.1093/nar/gkv1189 pmid:26553804 pmcid:PMC4702849 fatcat:2bm7d5coyvfotaj23hnj3mce6m

NCBI Reference Sequences (RefSeq): current status, new features and genome annotation policy

K. D. Pruitt, T. Tatusova, G. R. Brown, D. R. Maglott
2011 Nucleic Acids Research  
We report here on recent growth, the status of curating the human RefSeq data set, more extensive feature annotation and current policy for eukaryotic genome annotation via the NCBI annotation pipeline  ...  The database includes over 16 000 organisms, 2.4 × 10^6 genomic records, 13 × 10^6 proteins and 2 × 10^6 RNA records spanning prokaryotes, eukaryotes and viruses (RefSeq release 49, September 2011).  ...  RefSeq genome representation for prokaryotes is currently managed by propagating annotation from the primary genome data in GenBank, calculating annotation for RefSeq when annotation is not available in  ... 
doi:10.1093/nar/gkr1079 pmid:22121212 pmcid:PMC3245008 fatcat:n2exzcnlwbfi7ef24fcdeyuaca

NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins

K. D. Pruitt
2004 Nucleic Acids Research  
from the scientific community, automated annotation, propagation from GenBank and curation by NCBI staff.  ...  The database incorporates data from over 2400 organisms and includes over one million proteins representing significant taxonomic diversity spanning prokaryotes, eukaryotes and viruses.  ...  The collection is curated on an ongoing basis by collaborating groups and by NCBI staff. Sequence records are presented in a standard format and are subject to computational validation.  ... 
doi:10.1093/nar/gki025 pmid:15608248 pmcid:PMC539979 fatcat:gneoq6nakbeptfwoymdt4kfj4e

RefSeq curation and annotation of stop codon recoding in vertebrates

Bhanu Rajput, Kim D Pruitt, Terence D Murphy
2018 Nucleic Acids Research  
Gene annotations were curated in nine vertebrate model organisms and integrated into NCBI's Reference Sequence (RefSeq) dataset, resulting in 247 selenoprotein genes encoding 322 selenoproteins, and 93  ...  Our goal was to provide accurately curated and annotated datasets of selenoprotein and SCR transcript and protein records to serve as annotation standards and to promote basic and biomedical research.  ...  ACKNOWLEDGEMENTS We would like to acknowledge RefSeq curators Catherine Farrell and David Webb, and developers Vamsi Kodali and Alexander Souvorov for helpful consults.  ... 
doi:10.1093/nar/gky1234 pmid:30535227 pmcid:PMC6344875 fatcat:7uo7isrv7rfnji4nwlwhhrspme

NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins

K. D. Pruitt, T. Tatusova, D. R. Maglott
2007 Nucleic Acids Research  
The format of all RefSeq records is validated, and an increasing number of tests are being applied to evaluate the quality of sequence and annotation, especially in the context of complete genomic sequence  ...  NCBI's reference sequence (RefSeq) database (http://www.ncbi.nlm.nih.gov/RefSeq/) is a curated non-redundant collection of sequences representing genomes, transcripts and proteins.  ...  gene annotations on the human and mouse genomes.  ... 
doi:10.1093/nar/gkl842 pmid:17130148 pmcid:PMC1716718 fatcat:t7rciayrzncfnmxhdp44hjk76a

NCBI Reference Sequences: current status, policy and new initiatives

K. D. Pruitt, T. Tatusova, W. Klimke, D. R. Maglott
2009 Nucleic Acids Research  
NCBI's Reference Sequence (RefSeq) database (http://www.ncbi.nlm.nih.gov/RefSeq/) is a curated non-redundant collection of sequences representing genomes, transcripts and proteins.  ...  We report here on the recent growth of the database, recent changes to feature annotations and record types for eukaryotic (primarily vertebrate) species and policies regarding species inclusion and genome  ...  NCBI's computational genome annotation pipelines are used to annotate some prokaryotic and eukaryotic genomes.  ... 
doi:10.1093/nar/gkn721 pmid:18927115 pmcid:PMC2686572 fatcat:77xptvs54fghxb3dur3ykwycgq

NCBI prokaryotic genome annotation pipeline

Tatiana Tatusova, Michael DiCuccio, Azat Badretdin, Vyacheslav Chetvernin, Eric P. Nawrocki, Leonid Zaslavsky, Alexandre Lomsadze, Kim D. Pruitt, Mark Borodovsky, James Ostell
2016 Nucleic Acids Research  
The pipeline provides a framework for generation and analysis of annotation on the full breadth of prokaryotic taxonomy.  ...  Thus, the new NCBI's Prokaryotic Genome Annotation Pipeline (PGAP) relies more on sequence similarity when confident comparative data are available, while it relies more on statistical predictions in the  ...  ACKNOWLEDGEMENTS The authors would like to thank Dr David Lipman for many fruitful discussions about prokaryotic biology, insightful suggestions on improving the annotation results and his continuous support  ... 
doi:10.1093/nar/gkw569 pmid:27342282 pmcid:PMC5001611 fatcat:6wcveeu35bcixgypu5i2nmnbba

The National Center for Biotechnology Information's Protein Clusters Database

William Klimke, Richa Agarwala, Azat Badretdin, Slava Chetvernin, Stacy Ciufo, Boris Fedorov, Boris Kiryutin, Kathleen O'Neill, Wolfgang Resch, Sergei Resenchuk, Susan Schafer, Igor Tolstoy (+1 others)
2008 Nucleic Acids Research  
There are 7180 clusters containing 376 513 proteins with curated gene and protein functional annotation.  ...  ProtClustDB provides an efficient method to aggregate gene and protein annotation for researchers and is available at http://www.ncbi.nlm.nih.gov/sites/entrez?db=proteinclusters.  ...  It was realized that annotating protein families as a group was a convenient and efficient way to functionally annotate the increasing numbers of prokaryotic genomes that were being deposited at an increasing  ... 
doi:10.1093/nar/gkn734 pmid:18940865 pmcid:PMC2686591 fatcat:rtrzuma6ubb33i6gsivhsi7yca

Solving the Problem: Genome Annotation Standards before the Data Deluge

William Klimke, Claire O'Donovan, Owen White, J. Rodney Brister, Karen Clark, Boris Fedorov, Ilene Mizrachi, Kim D. Pruitt, Tatiana Tatusova
2011 Standards in Genomic Sciences  
Standards in Genomic Sciences Table 1 (cont.) Databases, tools, resources for genomes and annotation.  ...  prokaryotic genomes are available as gold standard references.  ...  Craig Venter Institute for hosting the workshop and especially Tanja Davidsen and Ramana Madupu for help in the organization before, during, and after the workshop.  ... 
doi:10.4056/sigs.2084864 pmid:22180819 pmcid:PMC3236044 fatcat:57nszc6myncm5grcgonmngldl4

The MACADAM database: a MetAboliC pAthways DAtabase for Microbial taxonomic groups for mining potential metabolic capacities of archaeal and bacterial taxonomic groups

2019 Database: The Journal of Biological Databases and Curation  
For each prokaryotic 'RefSeq complete genome', MACADAM builds a pathway genome database (PGDB) using Pathway Tools software based on MetaCyc data that includes metabolic pathways as well as associated  ...  To ensure the highest quality of the genome functional annotation data, MACADAM also contains MicroCyc, a manually curated collection of PGDBs; Functional Annotation of Prokaryotic Taxa (FAPROTAX), a manually  ...  Acknowledgements The authors are grateful to the GenoToul Bioinformatics Platform, Toulouse, Occitanie, for providing assistance, computing and storage resources.  ... 
doi:10.1093/database/baz049 pmid:31032842 pmcid:PMC6487390 fatcat:xt5futbmqzb5ti24egyouuviii