Web-based bioinformatics workflows for end-to-end RNA-seq data computation and analysis in agricultural animal species

Weizhong Li, R. Alexander Richter, Yunsup Jung, Qiyun Zhu, Robert W. Li
2016 BMC Genomics  
Remarkable advances in Next Generation Sequencing (NGS) technologies, bioinformatics algorithms and computational technologies have significantly accelerated genomic research. However, complicated NGS data analysis still remains as a major bottleneck. RNA-seq, as one of the major area in the NGS field, also confronts great challenges in data analysis. Results: To address the challenges in RNA-seq data analysis, we developed a web portal that offers three integrated workflows that can perform
more » ... -to-end compute and analysis, including sequence quality control, read-mapping, transcriptome assembly, reconstruction and quantification, and differential analysis. The first workflow utilizes Tuxedo (Tophat, Cufflink, Cuffmerge and Cuffdiff suite of tools). The second workflow deploys Trinity for de novo assembly and uses RSEM for transcript quantification and EdgeR for differential analysis. The third combines STAR, RSEM, and EdgeR for data analysis. All these workflows support multiple samples and multiple groups of samples and perform differential analysis between groups in a single workflow job submission. The calculated results are available for download and post-analysis. The supported animal species include chicken, cow, duck, goat, pig, horse, rabbit, sheep, turkey, as well as several other model organisms including yeast, C. elegans, Drosophila, and human, with genomic sequences and annotations obtained from ENSEMBL. The RNA-seq portal is freely available from http://weizhongli-lab.org/RNA-seq. Conclusions: The web portal offers not only bioinformatics software, workflows, computation and reference data, but also an integrated environment for complex RNA-seq data analysis for agricultural animal species. In this project, our aim is not to develop new RNA-seq tools, but to build web workflows for using popular existing RNA-seq methods and make these tools more accessible to the communities.
doi:10.1186/s12864-016-3118-z pmid:27678198 pmcid:PMC5039875 fatcat:mrtnh7pgiradhfzu5gw34tkqwm