Genotype imputation using the Positional Burrows Wheeler Transform [article]

Simone Rubinacci, Olivier Delaneau, Jonathan Marchini
2019 bioRxiv   pre-print
Genotype imputation is the process of predicting unobserved genotypes in a sample of individuals using a reference panel of haplotypes. Increasing reference panel size poses ever-increasing computational challenges for imputation methods. Here we present IMPUTE5, a genotype imputation method that can scale to reference panels with millions of samples. It achieves fast and memory-efficient imputation by using the Positional Burrows Wheeler Transform (PBWT) to select haplotypes, which then serve as conditioning states within the IMPUTE model. With the HRC reference panel, IMPUTE5 is 20x faster than MINIMAC4 and 3x faster than BEAGLE5, and it uses less memory than both of these methods. IMPUTE5 scales sub-linearly with reference panel size: keeping the number of imputed markers constant, a 100-fold increase in reference panel size requires less than twice the computation time.
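The idea of selecting conditioning states through the PBWT can be illustrated with a small sketch. This is not IMPUTE5's implementation. It assumes a toy setting in which the target haplotype is observed at every site, alleles are biallelic (0/1), and the target is simply appended to the panel. The function name `pbwt_select_states` and the parameter `n_neighbors` are hypothetical. At each site the haplotypes are kept sorted by their reversed prefixes (the PBWT ordering), and the panel haplotypes adjacent to the target in that ordering are collected as candidate states.

```python
def pbwt_select_states(panel, target, n_neighbors=1):
    """Toy PBWT-based state selection (illustrative, not IMPUTE5's code).

    panel: list of reference haplotypes, each a list of 0/1 alleles.
    target: one haplotype of the same length, observed at every site.
    Returns the sorted indices of panel haplotypes that are adjacent to
    the target in the PBWT ordering at any site.
    """
    haps = panel + [target]          # the target gets the last index
    target_idx = len(panel)
    order = list(range(len(haps)))   # positional prefix array before site 0
    selected = set()

    for k in range(len(target)):
        # PBWT update: stable partition by the allele at site k,
        # zeros first, then ones. This keeps haplotypes sorted by
        # their reversed prefixes ending at site k.
        zeros = [h for h in order if haps[h][k] == 0]
        ones = [h for h in order if haps[h][k] == 1]
        order = zeros + ones

        # Haplotypes next to the target share the longest matches
        # ending at site k, so take a window around its position.
        pos = order.index(target_idx)
        lo = max(0, pos - n_neighbors)
        hi = min(len(order), pos + n_neighbors + 1)
        selected.update(h for h in order[lo:hi] if h != target_idx)

    return sorted(selected)


if __name__ == "__main__":
    panel = [
        [0, 0, 0, 0],
        [1, 1, 1, 1],
        [0, 1, 0, 1],
        [1, 0, 1, 1],
    ]
    target = [1, 0, 1, 1]
    print(pbwt_select_states(panel, target, n_neighbors=1))
```

In this toy panel the haplotype that shares no allele with the target is never adjacent to it and is left out, so the set of states passed to the downstream model can stay small even as the panel grows. The per-site list rebuilding and `order.index` lookup are chosen for readability, not for the efficiency a production method would need.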
doi:10.1101/797944 fatcat:6fsnffyru5bwxauq7xi7lqfqui