PartTree: an algorithm to build an approximate tree from a large number of unaligned sequences

K. Katoh, H. Toh
2006 Bioinformatics  
Motivation: To construct a multiple sequence alignment (MSA) of a large number (>~10 000) of sequences, the calculation of a guide tree with a complexity of O(N²) to O(N³), where N is the number of sequences, is the most time-consuming process. Results: To overcome this limitation, we have developed an approximate algorithm, PartTree, to construct a guide tree with an average time complexity of O(N log N). The new MSA method with the PartTree algorithm can align ~60 000 sequences in several minutes on a standard desktop computer. The loss of accuracy in MSA caused by this approximation was estimated to be several percent in benchmark tests using Pfam. Availability: The present algorithm has been implemented in the MAFFT sequence alignment package.
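
To give a rough sense of scale for the complexities quoted in the abstract, the following sketch compares the number of all-against-all sequence pairs (quadratic growth) with N log2 N at the two sequence counts mentioned. It is only an illustration of growth rates: constant factors are ignored, the function names are hypothetical, and it does not implement PartTree or any part of MAFFT.

```python
import math


def pairwise_comparisons(n: int) -> int:
    """Number of unordered sequence pairs, n*(n-1)/2, which grows as O(N^2)."""
    return n * (n - 1) // 2


def n_log_n(n: int) -> float:
    """N * log2(N): the growth rate of an O(N log N) procedure, constants ignored."""
    return n * math.log2(n)


# Sequence counts taken from the abstract (both are approximate figures there).
for n in (10_000, 60_000):
    pairs = pairwise_comparisons(n)
    print(f"N={n}: {pairs:.2e} pairs vs {n_log_n(n):.2e} for N log2 N")
```

At N = 60 000 the pair count is more than a thousand times larger than N log2 N, which is consistent with the guide tree step dominating the run time when a full distance matrix is computed.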
doi:10.1093/bioinformatics/btl592 pmid:17118958 fatcat:nesvpujcpbfz7kfaldp2pyteeq