Preserving sequence annotations across reference sequences

Zuotian Tatum, Marco Roos, Andrew P Gibson, Peter EM Taschner, Mark Thompson, Erik A Schultes, Jeroen FJ Laros
2014 Journal of Biomedical Semantics  
Matching and comparing sequence annotations of different reference sequences is vital to genomics research, yet many annotation formats do not specify the reference sequence types or versions used.  ...  In doing so, we created the Reference Sequence Annotation to compensate for gaps in the SO and in its mapping to BFO, particularly for annotations that refer to versions of consensus reference sequences  ...  Conclusions We demonstrated a working data model of sequence annotations that can be preserved across different reference sequence assemblies.  ... 
doi:10.1186/2041-1480-5-s1-s6 pmid:25093075 pmcid:PMC4108922 fatcat:vsgfao5nhrhwrml65kinf4zxza

Truvari: Refined Structural Variant Comparison Preserves Allelic Diversity [article]

Adam C. English, Vipin K. Menon, Richard Gibbs, Ginger A. Metcalf, Fritz J. Sedlazeck
2022 bioRxiv   pre-print
As SV detection becomes more exact, algorithms to preserve this refined signal are needed.  ...  For multi-sample structural variant analyses like merging, benchmarking, and annotation, the fundamental operation is to identify when two SVs are the same.  ...  The 95% sequence and size similarity thresholds from Strict merge has a well-balanced preservation of unique SVs and reduction of redundant SVs across individual samples for this call-set.  ... 
doi:10.1101/2022.02.21.481353 fatcat:4d5pvniyzngihnzzh5cgnb2agi

Selected papers from the 16th Annual Bio-Ontologies Special Interest Group Meeting

Larisa N Soldatova, Philippe Rocca-Serra, Michel Dumontier, Nigam H Shah
2014 Journal of Biomedical Semantics  
The six papers selected for this supplement span a wide range of topics including: ontology-based data integration, ontology-based annotation of scientific literature, ontology and data model development  ...  Tatum et al. present a working data model of sequence annotations that can be preserved across different reference sequence assemblies.  ...  Tatum et al. in their paper titled "Preserving sequence annotations across reference sequences" present an RDF data model for describing sequence annotation instances within an established ontological  ... 
doi:10.1186/2041-1480-5-s1-i1 pmcid:PMC4108850 fatcat:cyqs6udi2nbqfio3cj34qg7eq4

Green plant genomes: What we know in an era of rapidly expanding opportunities

W. John Kress, Douglas E. Soltis, Paul J. Kersey, Jill L. Wegrzyn, James H. Leebens-Mack, Morgan R. Gostel, Xin Liu, Pamela S. Soltis
2022 Proceedings of the National Academy of Sciences of the United States of America  
Furthermore, the annotation of plant genomes is at present undergoing intensive improvement.  ...  These genomes range in size from 12 Mb to 27.6 Gb and are biased toward agricultural crops with large branches of the green tree of life untouched by genomic-scale sequencing.  ...  The ultimate goal of producing de novo reference genomes from across the plant tree of life will require well-preserved tissues collected specifically for use in genome sequencing projects in a way that  ... 
doi:10.1073/pnas.2115640118 pmid:35042803 pmcid:PMC8795535 fatcat:rdgrbaa37rdzda6q7n2lygzuru

Transcriptional activity and strain-specific history of mouse pseudogenes

Cristina Sisu, Paul Muir, Adam Frankish, Ian Fiddes, Mark Diekhans, David Thybert, Duncan T. Odom, Paul Flicek, Thomas M. Keane, Tim Hubbard, Jennifer Harrow, Mark Gerstein
2020 Nature Communications  
Here, combining both manual curation and automatic pipelines, we present a genome-wide annotation of the pseudogenes in the mouse reference genome and 18 inbred mouse strains (available via the  ...  We also annotate 165 unitary pseudogenes in mouse and 303 in human.  ...  We found 2,925 pseudogenes that are preserved across all strains.  ... 
doi:10.1038/s41467-020-17157-w pmid:32728065 pmcid:PMC7392758 fatcat:f644v4hikzf4td7qfyogqprfnm

Pseudogenes in the mouse lineage: transcriptional activity and strain-specific history [article]

Cristina Sisu, Paul Muir, Adam Frankish, Ian Fiddes, Mark Diekhans, David Thybert, Duncan Odom, Paul Flicek, Thomas Keane, Tim Hubbard, Jennifer Harrow, Mark Gerstein
2018 bioRxiv   pre-print
Here, we present a comprehensive genome-wide annotation of the pseudogenes in the mouse reference genome and associated strains.  ...  In turn, the mouse is an ideal platform for studying them, particularly with the availability of developmental transcriptional data and the sequencing of 18 strains.  ...  We observed that on average more than 97.7% of loci are preserved across the laboratory strains, and 96.7% of loci are preserved with respect to the wild-derived strains.  ... 
doi:10.1101/386656 fatcat:kywggxhkenendmyavpxfd3l7pa

Sediment Metagenomes as Time Capsules of Lake Microbiomes

Rebecca E. Garner, Irene Gregory-Eaves, David A. Walsh, Barbara J. Campbell
2020 mSphere  
reads and reference surface water metagenome assemblies.  ...  Overall, our study explored a novel application of whole-metagenome shotgun sequencing for discovering the DNA remains of a broad diversity of microorganisms preserved in lake sediments.  ...  Metagenome shotgun sequencing, assembly, and annotation.  ... 
doi:10.1128/msphere.00512-20 pmid:33148818 fatcat:xd6hj3ntqrgwzi772b24dvm53i

Visualizing the protein sequence universe

Larissa Stanberry, Eugene Kolker, Geoffrey Fox, Roger Higdon, Winston Haynes, Natali Kolker, William Broomall, Saliya Ekanayake, Adam Hughes, Yang Ruan, Judy Qiu
2012 Proceedings of the 3rd international workshop on Emerging computational methods for the life sciences - ECMLS '12  
protein annotation.  ...  As an annotation example, we used the interpolation approach to map the set of annotated archaeal proteins into the prokaryotic PSU.  ...  Each cluster contains one reference sequence and all proteins within the similarity threshold to the reference.  ... 
doi:10.1145/2483954.2483958 fatcat:4bwfbqq6tjgchcydva7qi3s7d4

TranscriptClean: Variant-aware correction of indels, mismatches, and splice junctions in long-read transcripts

Dana Wyman, Ali Mortazavi, Bonnie Berger
2018 Bioinformatics  
Therefore, we developed the package TranscriptClean to correct mismatches, microindels and noncanonical splice junctions in mapped transcripts using the reference genome while preserving known variants  ...  However, their high error rates are an obstacle to distinguishing novel transcript isoforms from sequencing artifacts.  ...  preserving known variants.  ... 
doi:10.1093/bioinformatics/bty483 pmid:29912287 pmcid:PMC6329999 fatcat:q3gq4u5wbzhkxkasclgwpq7ura

Geographic and Genomic Distribution of SARS-CoV-2 Mutations

Daniele Mercatelli, Federico M. Giorgi
2020 Frontiers in Microbiology  
We analyzed and annotated all SARS-CoV-2 mutations compared with the reference Wuhan genome NC_045512.2, observing an average of 7.23 mutations per sample.  ...  Our analysis shows the prevalence of single nucleotide transitions as the major mutational type across the world.  ...  Colors are assigned randomly but preserved across panels to facilitate tracking of identical types across continents.  ... 
doi:10.3389/fmicb.2020.01800 pmid:32793182 pmcid:PMC7387429 fatcat:c3u7ae3dwbdxvoq3adg4jzpy5y

Comparative analysis of pseudogenes across three phyla

Cristina Sisu, Baikang Pei, Jing Leng, Adam Frankish, Yan Zhang, Suganthi Balasubramanian, Rachel Harte, Daifeng Wang, Michael Rutenberg-Schoenberg, Wyatt Clark, Mark Diekhans, Joel Rozowsky (+3 others)
2014 Proceedings of the National Academy of Sciences of the United States of America  
However, there are no pseudogene orthologs preserved across all three species (Fig. 3A and SI Appendix, Table S2).  ...  regulatory roles. genome annotation | functional genomics | transcriptomics Often referred to as "genomic fossils" (1-3), pseudogenes are defined as disabled copies of protein-coding genes.  ... 
doi:10.1073/pnas.1407293111 pmid:25157146 pmcid:PMC4169933 fatcat:gdjizdrksrhzrcykydnkzj53ku

BRILIA: Integrated Tool for High-Throughput Annotation and Lineage Tree Assembly of B-Cell Repertoires

Donald W. Lee, Ilja V. Khavrutskii, Anders Wallqvist, Sina Bavari, Christopher L. Cooper, Sidhartha Chaudhury
2017 Frontiers in Immunology  
Furthermore, we show that the complete gene usage annotation and SHM identification across the entire CDR3 are essential for studying the B-cell affinity maturation process through immunosequencing methods  ...  For the sample sequences in the middle, the V, NVD, D, NDJ, and J segments are separated by a space, where a double space indicates a lack of N region (e.g., NDJ is absent initially).  ...  The simulated sequences preserved other details such as the frequent C→T and G→A mutations mediated by AID (50-53), preferential occurrence of A mutations over T mutations (referred to  ... 
doi:10.3389/fimmu.2016.00681 pmid:28144239 pmcid:PMC5239784 fatcat:ovs7wix6bjbdpfifdakqr2imhi

MEGARes: an antimicrobial resistance database for high throughput sequencing

Steven M. Lakin, Chris Dean, Noelle R. Noyes, Adam Dettenwanger, Anne Spencer Ross, Enrique Doster, Pablo Rovira, Zaid Abdo, Kenneth L. Jones, Jaime Ruiz, Keith E. Belk, Paul S. Morley (+1 others)
2016 Nucleic Acids Research  
Currently, antimicrobial resistance databases are tailored to smaller-scale, functional profiling of genes using highly descriptive annotations.  ...  MEGARes can be browsed as a stand-alone resource through the website or can be easily integrated into sequence analysis pipelines through download.  ...  Again, this is meant to preserve the nucleotide identity within groupings and maintain reasonable biological categories across the database.  ... 
doi:10.1093/nar/gkw1009 pmid:27899569 pmcid:PMC5210519 fatcat:57lztyoqgjc3nmai3rqnahz2zi

RATT: Rapid Annotation Transfer Tool

Thomas D. Otto, Gary P. Dillon, Wim S. Degrave, Matthew Berriman
2011 Nucleic Acids Research  
We have developed a method to rapidly provide accurate annotation for new genomes using previously annotated genomes as a reference.  ...  The method, implemented in a tool called RATT (Rapid Annotation Transfer Tool), transfers annotations from a high-quality reference to a new genome on the basis of conserved synteny.  ...  ACKNOWLEDGEMENTS We would like to thank Ulrike Böhme for the annotation of P. berghei and testing the program. We thank Adam Reid and Jason Tsai for comments and reviewing the article.  ... 
doi:10.1093/nar/gkq1268 pmid:21306991 pmcid:PMC3089447 fatcat:monllyjjoja7rlzybozv6eha4m

gcType: a high-quality type strain genome database for microbial phylogenetic and functional research

Wenyu Shi, Qinglan Sun, Guomei Fan, Sugawara Hideaki, Ohkuma Moriya, Takashi Itoh, Yuguang Zhou, Man Cai, Song-Gun Kim, Jung-Sook Lee, Ivo Sedlacek, David R Arahal (+24 others)
2020 Nucleic Acids Research  
pipelines to form a high-quality reference database.  ...  Additionally, the information provided through gcType includes >12 000 publicly available type strain genome sequences from GenBank incorporated using quality control criteria and standard data annotation  ...  We would like to thank the supports from National Institute of Genetics (NIG) members, Professor Yasukazu Nakamura, Dr Yasuhiro Tanizawa and Asami Fukuda for their helps on data curation and annotation  ... 
doi:10.1093/nar/gkaa957 pmid:33119759 fatcat:7sn5nvlxgrhx5aw53k6emsyt2u