Probabilistic Inference of Viral Quasispecies Subject to Recombination [chapter]

Osvaldo Zagordi, Armin Töpfer, Sandhya Prabhakaran, Volker Roth, Eran Halperin, Niko Beerenwinkel
2012 Lecture Notes in Computer Science  
RNA viruses are present in a single host as a population of different but related strains. This population, shaped by the combination of genetic change and selection, is called a quasispecies. Genetic change is due to both point mutations and recombination events. We present a jumping hidden Markov model that describes the generation of the viral quasispecies and a method to infer its parameters by analysing next-generation sequencing data. The model introduces position-specific probability ... over the sequence alphabet to explain the diversity that can be found in the population at each site. Recombination events are indicated by a change of state, allowing a single observed read to originate from multiple sequences. We present an implementation of the EM algorithm to find maximum likelihood estimates of the model parameters and a method to estimate the distribution of viral strains in the quasispecies. The model is validated on simulated data, showing the advantage of explicitly taking the recombination process into account, and applied to reads obtained from two experimental HIV samples.
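
As a rough illustration of the generative idea described above, the following minimal sketch samples a read from a toy jumping hidden Markov model: each hidden state is a generating sequence with its own per-position distribution over the alphabet, and a change of state partway through a read stands in for a recombination event. This is not the authors' implementation. The names `make_table` and `sample_read`, the single `jump_prob` parameter, the uniform choice of the starting state, and the uniform `error` rate are all assumptions made for the example.

```python
import random

ALPHABET = "ACGT"


def make_table(sequence, error=0.01):
    """Build per-position base distributions for one generating sequence.

    Hypothetical helper: each position puts mass 1 - error on the sequence's
    own base and spreads `error` evenly over the other three bases.
    """
    return [
        {b: (1.0 - error) if b == base else error / 3.0 for b in ALPHABET}
        for base in sequence
    ]


def sample_read(tables, start, length, jump_prob, rng):
    """Sample one read of `length` bases beginning at position `start`.

    tables[k][j] maps each base to its probability under generator k at
    position j. Returns the read and the hidden path of generator indices.
    """
    n_generators = len(tables)
    state = rng.randrange(n_generators)  # assumed: uniform initial state
    bases, path = [], []
    for j in range(start, start + length):
        # Between consecutive positions, jump to a different generator with
        # probability jump_prob (the toy analogue of recombination).
        if j > start and n_generators > 1 and rng.random() < jump_prob:
            state = rng.choice([k for k in range(n_generators) if k != state])
        probs = tables[state][j]
        base = rng.choices(ALPHABET, weights=[probs[b] for b in ALPHABET])[0]
        bases.append(base)
        path.append(state)
    return "".join(bases), path


if __name__ == "__main__":
    rng = random.Random(1)
    tables = [make_table("ACGTACGTACGT"), make_table("ACGAACGAACGA")]
    read, path = sample_read(tables, start=2, length=8, jump_prob=0.1, rng=rng)
    print(read, path)
```

Setting `jump_prob` to 0 makes every read come from a single generating sequence, which is the no-recombination case that the abstract contrasts with the jumping model. Inferring the tables and jump probability from reads, as the chapter does with EM, is outside the scope of this sketch.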
doi:10.1007/978-3-642-29627-7_36 fatcat:e7idaemuy5elzgbxrambbshc5m