IsoTree: A New Framework for De novo Transcriptome Assembly from RNA-seq Reads

Jin Zhao, Haodi Feng, Daming Zhu, Chi Zhang, Ying Xu
2018 IEEE/ACM Transactions on Computational Biology & Bioinformatics  
High-throughput sequencing of mRNA has made deep and efficient probing of the transcriptome more affordable. However, the vast number of short RNA-seq reads makes de novo transcriptome assembly an algorithmic challenge. In this work, we present IsoTree, a novel framework for transcript reconstruction in the absence of reference genomes. Unlike most de novo assembly methods, which build a de Bruijn graph or splicing graph by connecting k-mers (sets of overlapping substrings generated from reads), IsoTree constructs the splicing graph by connecting reads directly. For each splicing graph, IsoTree applies an iterative mixed integer linear programming scheme to build a prefix tree, called an isoform tree. Each path from the root node of the isoform tree to a leaf node represents a plausible transcript candidate, and these candidates are pruned based on paired-end read information. Experiments showed that in most cases IsoTree performs better than other leading transcriptome assembly programs. IsoTree is available at https://github.com/Jane110111107/IsoTree.
Index Terms: RNA-seq, de novo assembly, alternative splicing, transcriptome.
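Two ideas in the abstract can be illustrated with a minimal sketch: k-mers as overlapping substrings of a read, and a prefix tree whose root-to-leaf paths enumerate transcript candidates. This is not IsoTree's implementation. The function and class names are hypothetical, the tree is filled from hand-written exon paths instead of the iterative mixed integer linear programming scheme, and no paired-end pruning is performed.

```python
from typing import Dict, List


def kmers(read: str, k: int) -> List[str]:
    """Return all overlapping length-k substrings of a read, left to right."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


class TreeNode:
    """Node of an illustrative prefix tree (hypothetical, not IsoTree's type)."""

    def __init__(self, label: str) -> None:
        self.label = label
        self.children: Dict[str, "TreeNode"] = {}


def insert_path(root: TreeNode, path: List[str]) -> None:
    """Insert one path of labels; shared prefixes reuse existing nodes."""
    node = root
    for label in path:
        node = node.children.setdefault(label, TreeNode(label))


def root_to_leaf_paths(root: TreeNode) -> List[List[str]]:
    """List every root-to-leaf path; each one is a transcript candidate."""
    paths: List[List[str]] = []

    def walk(node: TreeNode, prefix: List[str]) -> None:
        if not node.children:  # leaf reached: the prefix is a full candidate
            paths.append(prefix)
            return
        for child in node.children.values():
            walk(child, prefix + [child.label])

    # Start from the root's children so an empty tree yields no candidates.
    for child in root.children.values():
        walk(child, [child.label])
    return paths


# Assumed toy input: two isoforms sharing the first and last exon.
root = TreeNode("root")
insert_path(root, ["e1", "e2", "e4"])
insert_path(root, ["e1", "e3", "e4"])
candidates = root_to_leaf_paths(root)
```

In this toy input the two candidates share the prefix `e1`, so the tree stores that node once and branches only where the isoforms diverge, which is the property that makes a prefix tree a compact container for alternative transcripts.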
doi:10.1109/tcbb.2018.2808350 pmid:29994455 fatcat:x6bsq7rkavftrlztlwg5ucqwiu