Clustering of reads with alignment-free measures and quality values
2015
Algorithms for Molecular Biology
Also results on de novo assembly and metagenomic reads classification show that the introduction of quality values improves over standard alignment-free measures. ...
To the best of our knowledge this is the first study that incorporates quality value information and k-mers counts, in the context of alignment-free measures, for the comparison of reads data. ...
In this paper we presented a family of alignment-free measures, called D^q-type, that incorporate quality value information and k-mers counts for the comparison of reads data. ...
doi:10.1186/s13015-014-0029-x
pmid:25691913
pmcid:PMC4331138
fatcat:gjj5whpcezblvgglurwux4tjre
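The entry above describes measures that combine k-mer counts with quality values. A minimal sketch of one way such a combination could work is shown here. It assumes Phred-scaled qualities and weights each k-mer occurrence by the probability that all of its bases were called correctly. The function names and the weighting scheme are illustrative assumptions, not the authors' exact definition.

```python
from collections import defaultdict


def phred_to_error_prob(q):
    """Convert a Phred quality score into a base-call error probability."""
    return 10 ** (-q / 10)


def quality_weighted_kmer_counts(seq, quals, k):
    """Count k-mers in `seq`, weighting each occurrence by the probability
    that every base in it is correct (an assumed weighting scheme)."""
    if len(seq) != len(quals):
        raise ValueError("sequence and quality list must have equal length")
    if k <= 0:
        raise ValueError("k must be positive")
    # Per-base probability that the call is correct.
    p_ok = [1.0 - phred_to_error_prob(q) for q in quals]
    counts = defaultdict(float)
    for i in range(len(seq) - k + 1):
        weight = 1.0
        for p in p_ok[i:i + k]:
            weight *= p
        counts[seq[i:i + k]] += weight
    return dict(counts)


if __name__ == "__main__":
    # Every base has Phred 10 (error probability 0.1), so each 2-mer
    # occurrence contributes 0.9 * 0.9 to its count.
    print(quality_weighted_kmer_counts("ACGT", [10, 10, 10, 10], 2))
```

With uniformly high qualities the weights approach 1 and the result approaches plain k-mer counting, so noisy reads are the case where this weighting changes the comparison most.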
Fast comparison of genomic and meta-genomic reads with alignment-free measures based on quality values
2016
BMC Medical Genomics
Results: In this paper we present a family of alignment-free measures, called d^q-type, that are based on k-mer counts and quality values. ...
In this context it will be fundamental to exploit quality values information within the framework of alignment-free measures. ...
The use of quality values within alignment-free measures on average improves the classification accuracy and the impact of quality values increases when the reads are more noisy and the coverage is low ...
doi:10.1186/s12920-016-0193-6
pmid:27535823
pmcid:PMC4989896
fatcat:dg6ox7nwwrcijakfk5bvct3eye
De novo clustering of long-read transcriptome data using a greedy, quality-value based algorithm
[article]
2018
bioRxiv
pre-print
To address this challenge, we develop isONclust, a clustering algorithm that is greedy (in order to scale) and makes use of quality values (in order to handle variable error rates). ...
A common bottleneck is the dearth of scalable and accurate algorithms for clustering long reads according to their gene family of origin. ...
Evaluation metrics: There exist several metrics to measure the quality of clustering. We mainly use the V-measure and its two components, completeness and homogeneity [45]. ...
doi:10.1101/463463
fatcat:eqh6gmqp6rcyzabdeqv3mpnoa4
MeShClust2: Application of alignment-free identity scores in clustering long DNA sequences
[article]
2018
bioRxiv
pre-print
Abstract: Grouping sequences into similar clusters is an important part of sequence analysis. Widely used clustering tools sacrifice quality for speed. ...
Although MeShClust outperformed related tools in terms of cluster quality, the alignment algorithm used for generating training data for the classifier was not scalable to longer sequences. ...
This research was supported mainly by funds from the Oklahoma Center for the Advancement of Science and Technology [PS17-015] and in part by internal funds provided by the College of Engineering and Natural ...
doi:10.1101/451278
fatcat:rxlexall6rd33dlh44kfkreika
Transcriptome Analysis for Non-Model Organism: Current Status and Best-Practices
[chapter]
2017
Applications of RNA-Seq and Omics Strategies - From Microorganisms to Human Health
In this chapter, we aim to provide an overview of the state-of-the-art methods including (i) quality check and pre-processing of raw reads, (ii) the pros and cons of de novo transcriptome assemblers, ( ...
In spite of immense potential of RNA-Seq-based methods, particularly in recovering full-length transcripts and spliced isoforms from short-reads, the accurate results can be only obtained by the procedures ...
Acknowledgements All authors contributed to the editing of the manuscript and the content is solely the responsibility of the authors. ...
doi:10.5772/intechopen.68983
fatcat:vatg4hbanrchxhuxbf3meb3hye
Searching for SNPs with cloud computing
2009
Genome Biology
Crossbow is a cloud-computing software tool that combines the aligner Bowtie and the SNP caller SOAPsnp. ...
Executing in parallel using Hadoop, Crossbow analyzes data comprising 38-fold coverage of the human genome in three hours using a 320-CPU cluster rented from a cloud computing service for about $85. ...
We also thank Miron Livny and his team for providing access to their compute cluster. ...
doi:10.1186/gb-2009-10-11-r134
pmid:19930550
pmcid:PMC3091327
fatcat:pppdfms72fe4lbfa25blly4l3i
Mapping single molecule sequencing reads using basic local alignment with successive refinement (BLASR): application and theory
2012
BMC Bioinformatics
Results: We describe the method BLASR (Basic Local Alignment with Successive Refinement) for mapping Single Molecule Sequencing (SMS) reads that are thousands of bases long, with divergence between the ...
Conclusions: The results indicate that it is possible to map SMS reads with high accuracy and speed. ...
Acknowledgements We thank Jon Sorenson, James Bullard, Eric Schadt, and Jonas Korlach for useful comments in writing this manuscript.
Author details ...
doi:10.1186/1471-2105-13-238
pmid:22988817
pmcid:PMC3572422
fatcat:yap5l2k3w5dabpjwl3zois7zzu
HUGO: Hierarchical mUlti-reference Genome cOmpression for aligned reads
2014
JAMIA Journal of the American Medical Informatics Association
free content of genomic fragments without quality values and 5× CR when quality values are included. ...
The k-means clustering aims at partitioning L_r quality values into k clusters, so that the quality values within the same cluster can be replaced by the quality values of the cluster center [31]. We can ...
Provenance and peer review Not commissioned; externally peer reviewed. ...
doi:10.1136/amiajnl-2013-002147
pmid:24368726
pmcid:PMC3932469
fatcat:xpdcsupzxjg5fhc6i667fcyiti
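The k-means step mentioned in the entry above can be sketched as follows. This is a generic one-dimensional Lloyd's iteration over a list of quality values, written under the assumption that each value is simply replaced by its cluster center. The function name, the initialization, and the iteration cap are illustrative choices and are not taken from HUGO.

```python
def kmeans_quantize(quals, k, iters=50):
    """Cluster scalar quality values into at most k groups and replace
    each value by the center of its group (illustrative sketch)."""
    if not quals:
        raise ValueError("quals must not be empty")
    if k <= 0:
        raise ValueError("k must be positive")
    distinct = sorted(set(quals))
    k = min(k, len(distinct))
    # Deterministic start: evenly spaced picks from the sorted distinct values.
    if k == 1:
        centers = [sum(quals) / len(quals)]
    else:
        n = len(distinct)
        centers = [float(distinct[round(j * (n - 1) / (k - 1))]) for j in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for q in quals:
            nearest = min(range(len(centers)), key=lambda j: abs(q - centers[j]))
            clusters[nearest].append(q)
        # Keep the old center if a cluster ends up empty.
        updated = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    quantized = [min(centers, key=lambda c: abs(q - c)) for q in quals]
    return centers, quantized


if __name__ == "__main__":
    centers, quantized = kmeans_quantize([2, 2, 3, 38, 40, 40], k=2)
    print(centers)
    print(quantized)
```

After quantization the stream contains at most k distinct quality values, which is what makes it cheaper to encode. The trade-off is that the original per-base qualities cannot be recovered exactly.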
Using expected sequence features to improve basecalling accuracy of amplicon pyrosequencing data
2016
BMC Bioinformatics
provides sequence characteristics that allow generation of a set of high confidence error-free sequences. ...
Pyrosequencing errors, consisting mainly of nucleotide insertions and deletions, are on the other hand likely to disrupt open reading frames. ...
using Usearch v5.2.32 with seeds (cluster member with highest number of replicate reads) as output [15]. ...
doi:10.1186/s12859-016-1032-7
pmid:27102804
pmcid:PMC4841065
fatcat:we7tkmxsufbe5ntsad5ntyzdsy
Open-Source Sequence Clustering Methods Improve the State Of the Art
2016
mSystems
Sequence clustering is a common early step in amplicon-based microbial community analysis, when raw sequencing reads are clustered into operational taxonomic units (OTUs) to reduce the run time of subsequent ...
Furthermore, we observed that stringent quality filtering, such as is done in UPARSE, can cause a significant underestimation of species abundance and diversity, leading to incorrect biological results ...
ACKNOWLEDGMENTS We thank William Walters, Amnon Amir, Amanda Birmingham, Embriette Hyde, and Daniel McDonald for their time and valuable suggestions to improve the quality of the manuscript. ...
doi:10.1128/msystems.00003-15
pmid:27822515
pmcid:PMC5069751
fatcat:vtt6hkurmzbrrektwi3xl5qh7e
Alignment-free sequence comparison: benefits, applications, and tools
2017
Genome Biology
Alignment-free sequence analyses have been applied to problems ranging from whole-genome phylogeny to the classification of protein families, identification of horizontally transferred genes, and detection ...
We address these questions and provide a guide to the currently available alignment-free sequence analysis tools. ...
... with alignment-free measures (k-mer based) and quality values; Software (C++) [159]; http://www.dei.unipd.it/ciompin/main/qcluster.html
Reads error correction: Lighter; Correction of sequencing ...
doi:10.1186/s13059-017-1319-7
pmid:28974235
pmcid:PMC5627421
fatcat:5s7yd22l7bbmpljqc7fj4cbifm
IMSEQ—a fast and error aware approach to immunogenetic sequence analysis
2015
Bioinformatics
This type of analysis requires efficient and unambiguous clonotype assignment to a large number of NGS read sequences, including the identification of the incorporated V and J gene segments and the CDR3 ...
Current tools have deficits with respect to performance, accuracy and documentation of their underlying algorithms and usage. ...
Funding This research was funded by the German Federal Ministry of Education and Research (BMBF) within the grants "Primage" (0315895A) to N.B. and "eKid" (01ZX1312A) to N.B. as well as by the Investitionsbank ...
doi:10.1093/bioinformatics/btv309
pmid:25987567
fatcat:lnequrncmrdqfhmchpf7fe4nqq
Reader preferences and behavior on Wikipedia
2014
Proceedings of the 25th ACM conference on Hypertext and social media - HT '14
We show that the most read articles do not necessarily correspond to those frequently edited, suggesting some degree of non-alignment between user reading preferences and author editing preferences. ...
Wikipedia is a collaboratively-edited online encyclopaedia that relies on thousands of editors to both contribute articles and maintain their quality. ...
For each longevity value, we plot the percentage of articles with that value. ...
doi:10.1145/2631775.2631805
dblp:conf/ht/LehmannMLLK14
fatcat:ex6qa5pq7bd7pp6mx3kwtrhwoa
Computational Methods for DNA Copy-Number Analysis of Tumors
[chapter]
2014
Msphere
With the help of a comprehensive multistep computational procedure described here, copy-number profiles of tumor tissues or individual tumor cells may be generated and interpreted, starting with data acquired ...
These include accounting for variation of ploidy and distilling somatic copy number alterations from the inherited background. ...
There is one tab-delimited line for each read aligned, showing the read ID, read sequence and quality scores and the alignment position in the reference genome. ...
doi:10.1007/978-1-4939-0992-6_20
pmid:25030933
pmcid:PMC5136461
fatcat:ylc5pd5wfra5vbvac3nb7ercwi
FreePSI: an alignment-free approach to estimating exon-inclusion ratios without a reference transcriptome
2017
Nucleic Acids Research
In this paper, we propose an alignment-free method, FreePSI, to perform genomewide estimation of exon-inclusion ratios from RNA-Seq data without relying on the guidance of a reference transcriptome. ...
We compare FreePSI with the existing methods on simulated and real RNA-seq data in terms of both accuracy and efficiency and show that it is able to achieve very good performance even though a reference ...
We would like to thank Dr Rui Jiang (Tsinghua University) for the support of computational resources, and the anonymous referees for many constructive suggestions. ...
doi:10.1093/nar/gkx1059
pmid:29136203
pmcid:PMC5778508
fatcat:4iyj2ivgybdbvlj3hvczpnzule