Enriching public descriptions of marine phages using the Genomic Standards Consortium MIGS standard

Melissa Beth Duhaime, Renzo Kottmann, Dawn Field, Frank Oliver Glöckner
2011 Standards in Genomic Sciences  
In any sequencing project, the possible depth of comparative analysis is determined largely by the amount and quality of the accompanying contextual data. The structure, content, and storage of this contextual data should be standardized to ensure consistent coverage of all sequenced entities and facilitate comparisons. The Genomic Standards Consortium (GSC) has developed the "Minimum Information about Genome/Metagenome Sequences (MIGS/MIMS)" checklist for the description of genomes and here we
more » ... annotate all 30 publicly available marine bacteriophage sequences to the MIGS standard. These annotations build on existing International Nucleotide Sequence Database Collaboration (INSDC) records, and confirm, as expected that current submissions lack most MIGS fields. MIGS fields were manually curated from the literature and placed in XML format as specified by the Genomic Contextual Data Markup Language (GCDML). These "machine-readable" reports were then analyzed to highlight patterns describing this collection of genomes. Completed reports are provided in GCDML. This work represents one step towards the annotation of our complete collection of genome sequences and shows the utility of capturing richer metadata along with raw sequences.
doi:10.4056/sigs.621069 pmid:21677864 pmcid:PMC3111985 fatcat:l35i7jzkobenzjtecddurphw4u