Reconstruction of large phylogenetic trees: A parallel approach

Zhihua Du, Feng Lin, Usman W. Roshan
2005 Computational Biology and Chemistry
Reconstruction of phylogenetic trees for very large datasets is a known example of a computationally hard problem. In this paper, we present a parallel computing model for the widely used Multiple Instruction Multiple Data (MIMD) architecture. Following the idea of divide-and-conquer, our model adapts the Recursive-DCM3 decomposition method to divide datasets into smaller subproblems. It distributes the computational load over multiple processors so that each processor constructs subtrees on each subproblem within a batch in parallel. It finally collects the resulting trees and merges them into a supertree. The proposed model is flexible as far as methods for dividing and merging datasets are concerned. We show that our method greatly reduces the computational time of the sequential version of the program. As a case study, our parallel approach takes only 22.1 hours on four processors to outperform the best score to date (found at 123.7 hours by the sequential Rec-I-DCM3 program) on one dataset. Developed with the standard message-passing library, MPI, the program can be recompiled and run on any MIMD system.
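The divide, distribute, collect, and merge flow described in the abstract can be sketched roughly as follows. This is only an illustrative outline, not the authors' program: the functions `decompose`, `build_subtree`, and `merge_subtrees` are hypothetical placeholders standing in for Recursive-DCM3, the per-subproblem tree search, and the supertree merge, and a thread pool from the Python standard library is swapped in for MPI processes so the sketch runs anywhere. The batching rule (one subproblem per worker per batch) is likewise an assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def decompose(taxa, size, overlap):
    """Hypothetical stand-in for Recursive-DCM3: split the taxon list
    into overlapping chunks (requires size > overlap)."""
    step = size - overlap
    subsets = []
    for start in range(0, len(taxa), step):
        subsets.append(taxa[start:start + size])
        if start + size >= len(taxa):
            break  # the last chunk already reaches the end of the list
    return subsets


def build_subtree(subset):
    """Placeholder for the tree search on one subproblem.
    Here a 'subtree' is just the sorted tuple of its taxa."""
    return tuple(sorted(subset))


def merge_subtrees(subtrees):
    """Placeholder for the supertree merge: take the union of all taxa."""
    merged = set()
    for subtree in subtrees:
        merged.update(subtree)
    return sorted(merged)


def parallel_reconstruct(taxa, workers=4, size=4, overlap=1):
    """Divide, solve subproblems batch by batch in parallel, then merge."""
    subsets = decompose(taxa, size, overlap)
    subtrees = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Assumed batching: each batch holds at most one subproblem per worker.
        for i in range(0, len(subsets), workers):
            batch = subsets[i:i + workers]
            subtrees.extend(pool.map(build_subtree, batch))  # collect results
    return merge_subtrees(subtrees)


if __name__ == "__main__":
    taxa = [f"t{i}" for i in range(10)]
    print(parallel_reconstruct(taxa))
```

Because the three stages are separate functions, either the decomposition or the merge step could be replaced without touching the distribution loop, which mirrors the flexibility the abstract claims for dividing and merging methods.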
doi:10.1016/j.compbiolchem.2005.06.003 pmid:16040277 fatcat:mi2aej3uu5dpfdv5zjw57mgd5y