Design and Implementation of Parallelization of BLAST Algorithm Based on Spark

Zhen-yu LIU, Jing GAO, Zhi-jun SHEN, Fang ZHAO
2018 (2018-12-07), DEStech Transactions on Computer Science and Engineering, DEStech Publications
BLAST (Basic Local Alignment Search Tool) is a widely used local alignment algorithm with high accuracy. It reduces program running time while maintaining high precision, but it hits a performance bottleneck and becomes inefficient when comparing large gene data sets. Therefore, a distributed parallel method based on Spark, named Spark_BLAST, is proposed. The method uses Spark's in-memory computation to identify and divide tasks, and realizes distributed parallel computing of the BLAST algorithm. Finally, the method was implemented on a Spark cluster with 5 nodes. Comparison with a single machine shows that the Spark cluster reaches a speedup of about 4 without changing the accuracy of the alignment results. The method provides an efficient alignment approach for bioinformatics.
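The divide-and-merge idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' Spark_BLAST implementation: it uses Python threads in place of Spark executors, and `toy_align` (longest common substring length) is a hypothetical stand-in for a real BLAST call. The names `split_queries`, `align_partition`, `parallel_align`, and `parallel_efficiency` are all assumptions made for illustration. The sketch shows that splitting the queries into partitions, aligning each partition independently, and merging gives the same hits as a serial run, and it works out the parallel efficiency implied by a speedup of about 4 on 5 nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def split_queries(queries, num_partitions):
    """Round-robin split of (name, sequence) records into partitions."""
    return [queries[i::num_partitions] for i in range(num_partitions)]


def toy_align(query, subject):
    """Hypothetical stand-in for BLAST: longest common substring length."""
    best = 0
    prev = [0] * (len(subject) + 1)
    for qc in query:
        cur = [0] * (len(subject) + 1)
        for j, sc in enumerate(subject, start=1):
            if qc == sc:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def align_partition(partition, subject):
    """Align every query in one partition against the subject."""
    return [(name, toy_align(seq, subject)) for name, seq in partition]


def parallel_align(queries, subject, num_partitions):
    """Split, align each partition in its own worker, then merge the hits."""
    partitions = split_queries(queries, num_partitions)
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        results = pool.map(lambda p: align_partition(p, subject), partitions)
        merged = [hit for part in results for hit in part]
    return sorted(merged)


def parallel_efficiency(speedup, nodes):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / nodes


if __name__ == "__main__":
    queries = [("q1", "ACGTAC"), ("q2", "TTTT"), ("q3", "GGACGT")]
    subject = "ACGTACGT"
    serial = sorted(align_partition(queries, subject))
    # Partitioning must not change the hits, only how the work is distributed.
    assert parallel_align(queries, subject, 2) == serial
    # Speedup of about 4 on 5 nodes, as reported in the abstract.
    print(parallel_efficiency(4, 5))
```

Under these assumptions, a speedup of about 4 on 5 nodes corresponds to a parallel efficiency of roughly 80 percent; the remainder would be accounted for by task division, scheduling, and result merging.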
doi:10.12783/dtcse/iece2018/26643 fatcat:axf4f63hpbg7hl2qublv3h733y
