Inference with viral quasispecies diversity indices: clonal and NGS approaches

Josep Gregori, Miquel Salicrú, Esteban Domingo, Alex Sanchez, Juan I. Esteban, Francisco Rodríguez-Frías, Josep Quer
2014 Computer applications in the biosciences : CABIOS  
Given the inherent dynamics of a viral quasispecies, we are often interested in comparing diversity indices of sequential samples from a patient, or diversity indices of virus in groups of patients in a treated versus control design. It is then important to make sure that the diversity measures from each sample can be compared without bias and within a consistent statistical framework. In the present report we review some indices often used as measures of viral quasispecies complexity and provide means for statistical inference, applying procedures taken from the field of ecology. In particular, we examine the Shannon entropy and the mutation frequency, and we discuss the appropriateness of different normalization methods for the Shannon entropy found in the literature. Taking raw amplicon ultra-deep pyrosequencing (UDPS) data as a surrogate for a real HCV viral population, we study through in-silico sampling the statistical properties of these indices under two methods of viral quasispecies sampling: classical cloning followed by Sanger sequencing (CCSS) and next-generation sequencing (NGS) such as UDPS. We propose solutions specific to each of the two sampling methods, CCSS and NGS, to guarantee statistically conforming conclusions that are as free of bias as possible.
doi:10.1093/bioinformatics/btt768 pmid:24389655 fatcat:arpntvvm65cvxopkq3kulvr7va
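
The abstract refers to the Shannon entropy, its normalization, and in-silico sampling without giving formulas. The sketch below is only an illustration of those ideas, not the authors' procedure: it uses the standard plug-in entropy estimator, two commonly seen normalizations (by the log of the number of haplotypes or by the log of the number of reads), and simple subsampling without replacement. The function names and the haplotype counts are hypothetical and invented for this example.

```python
import math
import random
from collections import Counter

def shannon_entropy(counts):
    """Plug-in Shannon entropy (natural log) from haplotype read counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def normalized_entropy(counts, by="haplotypes"):
    """Divide the entropy by ln(number of haplotypes) or ln(number of reads)."""
    counts = [c for c in counts if c > 0]
    denom = len(counts) if by == "haplotypes" else sum(counts)
    return shannon_entropy(counts) / math.log(denom) if denom > 1 else 0.0

def subsample(counts, size, rng):
    """Draw `size` reads without replacement; return the resulting counts."""
    pool = [i for i, c in enumerate(counts) for _ in range(c)]
    return list(Counter(rng.sample(pool, size)).values())

def mean_subsample_entropy(counts, size, reps, rng):
    """Average plug-in entropy over `reps` random subsamples of `size` reads."""
    values = [shannon_entropy(subsample(counts, size, rng)) for _ in range(reps)]
    return sum(values) / len(values)

# Hypothetical haplotype counts standing in for a deeply sequenced population.
population = [5000, 2500, 1200, 600, 300, 200, 100, 50, 30, 20]
rng = random.Random(0)  # fixed seed so the run is repeatable

for size in (50, 500, 5000):
    print(size, round(mean_subsample_entropy(population, size, 100, rng), 3))
print("full", round(shannon_entropy(population), 3))
```

Under these assumptions, the mean entropy of small subsamples (the order of magnitude typical of clonal sampling) falls below the value computed from the full set of reads, which is the kind of sample-size dependence that makes unadjusted comparisons between samples of different depth risky. The two normalizations also behave differently: dividing by the log of the number of reads depends directly on sequencing depth, whereas dividing by the log of the number of haplotypes depends on how many variants happened to be observed.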