The Undirected Incomplete Perfect Phylogeny Problem

R.V. Satya, A. Mukherjee
2008 IEEE/ACM Transactions on Computational Biology & Bioinformatics  
The incomplete perfect phylogeny (IPP) problem and the incomplete perfect phylogeny haplotyping (IPPH) problem deal with constructing a phylogeny for a given set of haplotypes or genotypes with missing entries. Earlier approaches to both problems dealt with restricted versions, in which the root is either available or can be trivially reconstructed from the data, or in which certain assumptions were made about the data. In this paper, we deal with the unrestricted version of the problem, where the root of the phylogeny is neither available nor trivially recoverable from the data. The conditions under which a set of incomplete haplotypes admits a unique perfect phylogeny have remained an open problem. We solve this open problem and state a set of necessary and sufficient conditions for a given set of incomplete haplotypes or genotypes to admit a perfect phylogeny. Both the IPP and IPPH problems have previously been proven to be NP-complete. Here, we present efficient algorithms that can handle practical instances of the problem. Empirical analysis on simulated data shows that the algorithms take less than a second on data involving as many as a hundred SNP loci. An implementation of our method will be made available online shortly.
doi:10.1109/tcbb.2007.70218 pmid:18989047 fatcat:wc2p26idcrgmrgf2odfscfhnqq