Databases of discovery

James Ostell
2005 Queue  
The National Center for Biotechnology Information (NCBI) is the part of the National Institutes of Health (NIH) responsible for the largest public bibliographic database in biomedicine (PubMed), the U.S. national DNA sequence database (GenBank), an online free full text research article database, PubMed Central (PMC), assembly, annotation, and distribution of a reference set of genes, genomes, and chromosomes (RefSeq) from human to viruses, through online text search and retrieval systems
(Entrez) and specialized molecular biology data search engines (BLAST, CDD search, others), as well as dozens of other resources. At the time of writing this article, NCBI receives about 50 million web hits a day, at peak rates of about 1900 hits a second, and about 400,000 BLAST searches a day from about 2.5 million users a day. The web site transfers about 0.6 terabytes a day and people interested in local copies of bulk data ftp about 1.2 terabytes a day.
doi:10.1145/1059791.1059806 pmid:16467894 pmcid:PMC1343446 fatcat:3rnhanaw6rao3o5ul5ktmulyym

Digital BioCuration: A Question of Balance

James Ostell
2009 Nature Precedings  
Human Dog
doi:10.1038/npre.2009.3257.1 fatcat:ys2pz6sj2bbyxkbxbx3r5v45xu

GenBank

Dennis Benson, David J. Lipman, James Ostell
1993 Nucleic Acids Research  
The GenBank sequence database has undergone an expansion in data coverage, annotation content and the development of new services for the scientific community. In addition to nucleotide sequences, data from the major protein sequence and structural databases, and from U.S. and European patents is now included in an integrated system. MEDLINE abstracts from published articles describing the sequences provide an important new source of biological annotation for sequence entries. In addition to
continued support of existing services, new CD-ROM and network-based systems have been implemented for literature retrieval and sequence similarity searching. Major releases of GenBank are now more frequent and the data are distributed in several new forms for both end users and software developers.
doi:10.1093/nar/21.13.2963 pmid:8332518 pmcid:PMC309721 fatcat:4oqnfqywpnfybm66uzinsgbmwi

The NCBI Data Model [chapter]

James M. Ostell, Jonathan A. Kans
2006 Methods of biochemical analysis  
Detailed discussions about the choice of ASN.1 for this task and its overall form can be found elsewhere (Ostell, 1995). What to Define? ...
doi:10.1002/9780470110607.ch6 fatcat:ainfsykafrg2lalrdz6tdeoj6u

GenBank

Dennis A. Benson, Mark Boguski, David J. Lipman, James Ostell
1994 Nucleic Acids Research  
The GenBank sequence database continues to expand its data coverage, quality control, annotation content and retrieval services for the scientific community. Besides handling direct submissions of sequence data from authors, GenBank also incorporates DNA sequences from all available public sources; an integrated retrieval system, known as Entrez, also makes available data from the major protein sequence and structural databases, and from U.S. and European patents. MEDLINE abstracts from
published articles describing the sequences are also included as an additional source of biological annotation for sequence entries. GenBank supports distribution of the data via FTP, CD-ROM, and E-mail servers. Network server-client programs provide access to an integrated database for literature retrieval and sequence similarity searching.
doi:10.1093/nar/22.17.3441 pmid:7937042 pmcid:PMC308298 fatcat:ihc23fsh7rgixocbbqu6rfweyi

GenBank

Karen Clark, Ilene Karsch-Mizrachi, David J. Lipman, James Ostell, Eric W. Sayers
2015 Nucleic Acids Research  
GenBank® (www.ncbi.nlm.nih.gov/genbank/) is a comprehensive database that contains publicly available nucleotide sequences for over 340 000 formally described species. Recent developments include a new starting page for submitters, a shift toward using accession.version identifiers rather than GI numbers, a wizard for submitting 16S rRNA sequences, and an Identical Protein Report to address growing issues of data redundancy. GenBank organizes the sequence data received from individual
laboratories and large-scale sequencing projects into 18 divisions, and GenBank staff assign unique accession.version identifiers upon data receipt. Most submitters use the web-based BankIt or standalone Sequin programs. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the nuccore, nucest, and nucgss databases of the Entrez retrieval system, which integrates these records with a variety of other data including taxonomy nodes, genomes, protein structures, and biomedical journal literature in PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP.
doi:10.1093/nar/gkv1276 pmid:26590407 pmcid:PMC4702903 fatcat:czyvsrb4gffjfop3wlvryqj7km

GenBank

Dennis A. Benson, Ilene Karsch-Mizrachi, David J. Lipman, James Ostell, Eric W. Sayers
2009 Nucleic Acids Research  
GenBank® is a comprehensive database that contains publicly available nucleotide sequences for more than 300 000 organisms named at the genus level or lower, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank staff
upon receipt. Daily data exchange with the European Molecular Biology Laboratory Nucleotide Sequence Database in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bi-monthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI homepage: www.ncbi.nlm.nih.gov.
doi:10.1093/nar/gkp1024 pmid:19910366 pmcid:PMC2808980 fatcat:5hx2nirjdjbrfj3khqjbfp5avq

GenBank

Dennis A. Benson, Karen Clark, Ilene Karsch-Mizrachi, David J. Lipman, James Ostell, Eric W. Sayers
2013 Nucleic Acids Research  
GenBank® is a comprehensive database that contains publicly available nucleotide sequences for over 280 000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole-genome shotgun and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and GenBank staff assign accession numbers upon data receipt.
Daily data exchange with the European Nucleotide Archive and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the National Center for Biotechnology Information (NCBI) Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI home page: www.ncbi.nlm.nih.gov.
doi:10.1093/nar/gkt1030 pmid:24217914 pmcid:PMC3965104 fatcat:wjdyeblmtbh2nchbemgybvfsle

GenBank

Dennis A. Benson, Karen Clark, Ilene Karsch-Mizrachi, David J. Lipman, James Ostell, Eric W. Sayers
2014 Nucleic Acids Research  
GenBank® (http://www.ncbi.nlm.nih.gov/genbank/) is a comprehensive database that contains publicly available nucleotide sequences for over 300 000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole-genome shotgun and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and GenBank staff assign
accession numbers upon data receipt. Daily data exchange with the European Nucleotide Archive and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP.
doi:10.1093/nar/gku1216 pmid:25414350 pmcid:PMC4383990 fatcat:rrzuod6pgrah3hq4nyuy6bzpyy

GenBank

Dennis A. Benson, Mark Cavanaugh, Karen Clark, Ilene Karsch-Mizrachi, David J. Lipman, James Ostell, Eric W. Sayers
2016 Nucleic Acids Research  
GenBank® (www.ncbi.nlm.nih.gov/genbank/) is a comprehensive database that contains publicly available nucleotide sequences for 370 000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or the NCBI Submission Portal. GenBank staff assign
accession numbers upon data receipt. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Nucleotide database, which links to related information such as taxonomy, genomes, protein sequences and structures, and biomedical journal literature in PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. Recent updates include changes to policies regarding sequence identifiers, an improved 16S submission wizard, targeted loci studies, the ability to submit methylation and BioNano mapping files, and a database of anti-microbial resistance genes.
doi:10.1093/nar/gkw1070 pmid:27899564 pmcid:PMC5210553 fatcat:u2pjbbcpwvgifdolo6wmqt5vvi

GenBank

Dennis A. Benson, Mark Cavanaugh, Karen Clark, Ilene Karsch-Mizrachi, David J. Lipman, James Ostell, Eric W. Sayers
2012 Nucleic Acids Research  
GenBank® (http://www.ncbi.nlm.nih.gov) is a comprehensive database that contains publicly available nucleotide sequences for almost 260 000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole-genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and GenBank staff assigns
accession numbers upon data receipt. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI home page: www.ncbi.nlm.nih.gov.
doi:10.1093/nar/gks1195 pmid:23193287 pmcid:PMC3531190 fatcat:jjf53eywsndvpbfees2ly4jmxi

GenBank

Dennis A Benson, Mark Cavanaugh, Karen Clark, Ilene Karsch-Mizrachi, James Ostell, Kim D Pruitt, Eric W Sayers
2017 Nucleic Acids Research  
GenBank® (www.ncbi.nlm.nih.gov/genbank/) is a comprehensive database that contains publicly available nucleotide sequences for 400 000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun and environmental sampling projects. Most submissions are made using BankIt, the National Center for Biotechnology Information (NCBI) Submission Portal, or
the tool tbl2asn. GenBank staff assign accession numbers upon data receipt. Daily data exchange with the European Nucleotide Archive and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the NCBI Nucleotide database, which links to related information such as taxonomy, genomes, protein sequences and structures, and biomedical journal literature in PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. Recent updates include changes to sequence identifiers, submission wizards for 16S and Influenza sequences, and an Identical Protein Groups resource.
doi:10.1093/nar/gkx1094 pmid:29140468 pmcid:PMC5753231 fatcat:tcapoinydngldd7erbnmv3x2aq

A tool for aligning very similar DNA sequences

Kun-Mao Chao, Jinghui Zhang, James Ostell, Webb Miller
1997 Bioinformatics  
Results: We have produced a computer program, named sim3, that solves the following computational problem. Two DNA sequences are given, where the shorter sequence is very similar to some contiguous region of the longer sequence. Sim3 determines such a similar region of the longer sequence, and then computes an optimal set of single-nucleotide changes (i.e. insertions, deletions or substitutions) that will convert the shorter sequence to that region. Thus, the alignment scoring scheme is
... to model sequencing errors, rather than evolutionary processes. The program can align a 100 kb sequence to a 1 megabase sequence in a few seconds on a workstation, provided that there are very few differences between the shorter sequence and some region in the longer sequence. The program has been used to assemble sequence data for the Genomes Division at the National Center for Biotechnology Information. Availability: A version of sim3 for UNIX machines can be obtained by anonymous ftp from ncbi.nlm.nih.gov, in the pub/sim3 directory.
doi:10.1093/bioinformatics/13.1.75 fatcat:3brmm33avjfcholmsfkmdyz4am

A local alignment tool for very long DNA sequences

Kun-Mao Chao, Jinghui Zhang, James Ostell, Webb Miller
1995 Bioinformatics  
This paper presents a practical program, called sim2, for building local alignments of two sequences, each of which may be hundreds of kilobases long. Sim2 first constructs n best non-intersecting chains of "fragments," such as all occurrences of identical 5-tuples in each of two DNA sequences, for any specified n ≥ 1. Each chain is then refined by delivering an optimal alignment in a region delimited by the chain. Sim2 requires only space proportional to the size of the input sequences and the
output alignments, and the same source code runs on UNIX machines, on Macintosh, on PC, and on DEC ALPHA PC. We also describe an application of sim2 for aligning long DNA sequences from E. coli. Sim2 facilitates contig-building by providing a complete view of the related sequences, so differences can be analyzed and inconsistencies resolved. Examples are shown using the alignment display and editing functions from the software tool, ChromoScope.
doi:10.1093/bioinformatics/11.2.147 fatcat:ly57dervybbele25ya567kbxkm

Spidey: A Tool for mRNA-to-Genomic Alignments

Sarah J. Wheelan, Deanna M. Church, James M. Ostell
2001 Genome Research  
METHODS. Spidey: Design and Overview. Spidey is written in C and is incorporated in the NCBI Toolkit (Ostell 1996). ... It relies heavily on the alignment manager (Wheelan and Ostell, unpubl.), which is an indexing system used for easy management of and quick access to alignments and sets of alignments. ...
doi:10.1101/gr.195301 pmid:11691860 pmcid:PMC311166 fatcat:ibsx3vkkc5g7bjdwp6ovrnysxq