AntiFam: a tool to help identify spurious ORFs in protein annotation

R. Y. Eberhardt, D. H. Haft, M. Punta, M. Martin, C. O'Donovan, A. Bateman
2012 Database: The Journal of Biological Databases and Curation  
Acknowledgements We are grateful to James Tripp from University of California Santa Cruz, who took the time to alert us to one of these spurious families.  ...  Once one gene has been spuriously predicted and put in the sequence database, there is a danger that future genome projects will annotate new protein-coding genes by similarity to the first spurious ORF  ...  These models are designed to identify commonly recurring spuriously predicted ORFs.  ... 
doi:10.1093/database/bas003 pmid:22434837 pmcid:PMC3308159 fatcat:274zzsfhfzfhxlz5bwp3kvtsl4

Gene Unprediction with Spurio: A tool to identify spurious protein sequences

Wolfram Höps, Matt Jeffryes, Alex Bateman
2018 F1000Research  
We have developed the Spurio tool to help identify spurious protein predictions in prokaryotes.  ...  Our initial experiments suggest that less than 1% of the proteins in the UniProtKB sequence database are likely to be spurious and that Spurio is able to identify over 60 times more spurious proteins than  ...  Grant information The authors declare that no grants were involved in supporting this work  ... 
doi:10.12688/f1000research.14050.1 pmid:29721311 pmcid:PMC5897793 fatcat:c2ulay4ou5f55jv6dpfum676g4

Thousands of missed genes found in bacterial genomes and their analysis with COMBREX

Derrick E Wood, Henry Lin, Ami Levy-Moonshine, Rajeswari Swaminathan, Yi-Chien Chang, Brian P Anton, Lais Osmani, Martin Steffen, Simon Kasif, Steven L Salzberg
2012 Biology Direct  
Here we draw attention to a large number of likely genes missing from annotations using common tools such as Glimmer and BLAST.  ...  Annotation methods vary considerably and may fail to identify some genes.  ...  The other two spurious genes in this set, called translations of CRISPR regions by AntiFam, are homologs to two genes in Syntrophus aciditrophicus SB that were annotated as a "putative cytoplasmic protein  ... 
doi:10.1186/1745-6150-7-37 pmid:23111013 pmcid:PMC3534567 fatcat:f64hxyyxureh5drimum5boar24

Referee report. For: Gene Unprediction with Spurio: A tool to identify spurious protein sequences [version 1; referees: 2 approved]

Daniel H. Haft
2018
We have developed the Spurio tool to help identify spurious protein predictions in prokaryotes.  ...  Our initial experiments suggest that less than 1% of the proteins in the UniProtKB sequence database are likely to be spurious and that Spurio is able to identify over 60 times more spurious proteins than  ...  In this paper we begin to address this problem by creating a generic tool to identify spurious proteins. We term the task of identifying and deleting spurious gene predictions as gene unprediction.  ... 
doi:10.5256/f1000research.15280.r31445 fatcat:bgnqozvbw5dixlvlgoopq3orhe

Loose ends: almost one in five human genes still have unresolved coding status

Federico Abascal, David Juan, Irwin Jungreis, Laura Martinez, Maria Rigau, Jose Manuel Rodriguez, Jesus Vazquez, Michael L Tress
2018 Nucleic Acids Research  
A further 1470 genes annotated as coding in all three reference sets have characteristics that are typical of non-coding genes or pseudogenes.  ...  Data from large-scale genetic variation analyses suggests that most are not under protein-like purifying selection and so are unlikely to code for functional proteins.  ...  ACKNOWLEDGEMENTS The authors would like to thank Iakes Ezkurdia and Jon Mudge for their input on this paper.  ... 
doi:10.1093/nar/gky587 pmid:29982784 pmcid:PMC6101605 fatcat:4pyw4qdhtndkbcs7yag3prxcee