A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2018; you can also visit the original URL.
The file type is application/pdf.
AntiFam: a tool to help identify spurious ORFs in protein annotation
2012
Database: The Journal of Biological Databases and Curation
Acknowledgements: We are grateful to James Tripp from University of California Santa Cruz, who took the time to alert us to one of these spurious families. ...
Once one gene has been spuriously predicted and put in the sequence database, there is a danger that future genome projects will annotate new protein-coding genes by similarity to the first spurious ORF ...
These models are designed to identify commonly recurring spuriously predicted ORFs. ...
doi:10.1093/database/bas003
pmid:22434837
pmcid:PMC3308159
fatcat:274zzsfhfzfhxlz5bwp3kvtsl4
Gene Unprediction with Spurio: A tool to identify spurious protein sequences
2018
F1000Research
We have developed the Spurio tool to help identify spurious protein predictions in prokaryotes. ...
Our initial experiments suggest that less than 1% of the proteins in the UniProtKB sequence database are likely to be spurious and that Spurio is able to identify over 60 times more spurious proteins than ...
Grant information: The authors declare that no grants were involved in supporting this work ...
doi:10.12688/f1000research.14050.1
pmid:29721311
pmcid:PMC5897793
fatcat:c2ulay4ou5f55jv6dpfum676g4
Thousands of missed genes found in bacterial genomes and their analysis with COMBREX
2012
Biology Direct
Here we draw attention to a large number of likely genes missing from annotations using common tools such as Glimmer and BLAST. ...
Annotation methods vary considerably and may fail to identify some genes. ...
The other two spurious genes in this set, called translations of CRISPR regions by AntiFam, are homologs to two genes in Syntrophus aciditrophicus SB that were annotated as a "putative cytoplasmic protein ...
doi:10.1186/1745-6150-7-37
pmid:23111013
pmcid:PMC3534567
fatcat:f64hxyyxureh5drimum5boar24
Referee report. For: Gene Unprediction with Spurio: A tool to identify spurious protein sequences [version 1; referees: 2 approved]
2018
In this paper we begin to address this problem by creating a generic tool to identify spurious proteins. We term the task of identifying and deleting spurious gene predictions as gene unprediction. ...
doi:10.5256/f1000research.15280.r31445
fatcat:bgnqozvbw5dixlvlgoopq3orhe
Loose ends: almost one in five human genes still have unresolved coding status
2018
Nucleic Acids Research
A further 1470 genes annotated as coding in all three reference sets have characteristics that are typical of non-coding genes or pseudogenes. ...
Data from large-scale genetic variation analyses suggests that most are not under protein-like purifying selection and so are unlikely to code for functional proteins. ...
Acknowledgements: The authors would like to thank Iakes Ezkurdia and Jon Mudge for their input on this paper. ...
doi:10.1093/nar/gky587
pmid:29982784
pmcid:PMC6101605
fatcat:4pyw4qdhtndkbcs7yag3prxcee