High, clustered, nucleotide diversity in the genome of Anopheles gambiae revealed through pooled-template sequencing: implications for high-throughput genotyping protocols

Craig S Wilding, David Weetman, Keith Steen, Martin J Donnelly
2009 BMC Genomics  
Association mapping approaches are dependent upon discovery and validation of single nucleotide polymorphisms (SNPs). To further association studies in Anopheles gambiae we conducted a major resequencing programme, primarily targeting regions within or close to candidate genes for insecticide resistance. Results: Using two pools of mosquito template DNA we sequenced over 300 kbp across 660 distinct amplicons of the An. gambiae genome. Comparison of SNPs identified from pooled templates with
more » ... e from individual sequences revealed a very low false positive rate. False negative rates were much higher and mostly resulted from SNPs with a low minor allele frequency. Pooled-template sequencing also provided good estimates of SNP allele frequencies. Allele frequency estimation success, along with false positive and negative call rates, improved significantly when using a qualitative measure of SNP call quality. We identified a total of 7062 polymorphic features comprising 6995 SNPs and 67 indels, with, on average, a SNP every 34 bp; a high rate of polymorphism that is comparable to other studies of mosquitoes. SNPs were significantly more frequent in members of the cytochrome p450 mono-oxygenases and carboxy/cholinesterase genefamilies than in glutathione-S-transferases, other detoxification genes, and control genomic regions. Polymorphic sites showed a significantly clustered distribution, but the degree of SNP clustering (independent of SNP frequency) did not vary among gene families, suggesting that clustering of polymorphisms is a general property of the An. gambiae genome. Conclusion: The high frequency and clustering of SNPs has important ramifications for the design of high-throughput genotyping assays based on allele specific primer extension or probe hybridisation. We illustrate these issues in the context of the design of Illumina GoldenGate assays.
doi:10.1186/1471-2164-10-320 pmid:19607710 pmcid:PMC2723138 fatcat:mnnoeork6nhntizh3x6odjp5by