Oases: robust de novo RNA-seq assembly across the dynamic range of expression levels

Marcel H. Schulz, Daniel R. Zerbino, Martin Vingron, Ewan Birney
2012 Bioinformatics  
Motivation: High-throughput sequencing has made the analysis of new model organisms more affordable. Although assembling a new genome can still be costly and difficult, it is possible to use RNA-seq to sequence mRNA. In the absence of a known genome, it is necessary to assemble these sequences de novo, taking into account possible alternative isoforms and the dynamic range of expression values.
Results: We present a software package named Oases designed to heuristically assemble RNA-seq reads in the absence of a reference genome, across a broad spectrum of expression values and in the presence of alternative isoforms. It achieves this by using an array of hash lengths, a dynamic filtering of noise, a robust resolution of alternative splicing events and the efficient merging of multiple assemblies. It was tested on human and mouse RNA-seq data and is shown to improve significantly on the transABySS and Trinity de novo transcriptome assemblers.
doi:10.1093/bioinformatics/bts094 pmid:22368243 pmcid:PMC3324515 fatcat:mxbrrybygzgpdar5jtht5ibnmm