Computational analysis of mutation spectra

I. B. Rogozin
2003 Briefings in Bioinformatics  
Mutation frequencies vary along a nucleotide sequence, and nucleotide positions with an exceptionally high mutation frequency are called hotspots. Mutation hotspots in DNA often reflect intrinsic properties of the mutation process, such as the specificity with which mutagens interact with nucleic acids and the sequence-specificity of DNA repair/replication enzymes. They might also reflect structural and functional features of target protein or RNA sequences in which they occur. The determinants of mutation frequency and specificity are complex and there are many analytical methods for their study. This paper discusses computational approaches to analysing mutation spectra (distribution of mutations along the target genes) that include many detectable (mutable) positions. The following methods are reviewed: mutation hotspot prediction; pairwise and multiple comparisons of mutation spectra; derivation of a consensus sequence; and analysis of correlation between nucleotide sequence features and mutation spectra. Spectra of spontaneous and induced mutations are used for illustration of the complexities and pitfalls of such analyses. In general, the DNA sequence context of mutation hotspots is a fingerprint of interactions between DNA and DNA repair/replication/modification enzymes, and the analysis of hotspot context provides evidence of such interactions.
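To make the notion of a hotspot concrete, the following is a minimal sketch, not the method of the paper. It assumes, purely for illustration, that mutations fall on detectable positions according to a single uniform Poisson rate, and it flags a position when its count is improbably high under that null after a Bonferroni-style correction for the number of positions. The function names `poisson_tail` and `find_hotspots`, the `alpha` cutoff, and the example counts are all hypothetical.

```python
import math


def poisson_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    cdf = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)


def find_hotspots(counts, alpha=0.05):
    """Return indices of positions whose mutation count is unexpectedly high.

    Assumption (illustrative only): every detectable position shares one
    Poisson rate equal to the mean count per position. A position is flagged
    when n_sites * P(X >= count) < alpha.
    """
    n_sites = len(counts)
    lam = sum(counts) / n_sites
    return [
        pos
        for pos, c in enumerate(counts)
        if c > 0 and n_sites * poisson_tail(c, lam) < alpha
    ]


# Made-up spectrum: mutation counts at ten detectable positions.
spectrum = [1, 0, 2, 1, 0, 1, 15, 0, 1, 2]
hotspots = find_hotspots(spectrum)
```

A uniform-rate null is a strong simplification. Real spectra are typically heterogeneous even among non-hotspot positions, which is one of the pitfalls the abstract alludes to, so a threshold derived this way should be read as a rough screen rather than a definitive classification.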
doi:10.1093/bib/4.3.210 pmid:14582516 fatcat:bqn7bdferzdihmbknphzlbcc7m