GraphSeq: Accelerating String Graph Construction for De Novo Assembly on Spark [article]

Chung-Tsai Su, Ming-Tai Chang, Yun-Chian Cheng, Yun-Lung Li, Yao-Ting Wang
2018 bioRxiv pre-print
De novo genome assembly is important both for assembling uncharacterized genomes and for identifying variants in a reference-unbiased way. Compared with the de Bruijn graph, the string graph is a lossless data representation for de novo assembly. However, string graph construction is computationally intensive. We propose GraphSeq, which accelerates string graph construction by leveraging the Spark distributed computing framework.
doi:10.1101/321729 fatcat:jfjun5zperdixly67qontc4l5i