Rapid evolution of antiviral APOBEC3 genes driven by the conflicts with ancient retroviruses [article]

Jumpei Ito, Robert J Gifford, Kei Sato
2019 bioRxiv   pre-print
The evolution of antiviral genes has been fundamentally shaped by antagonistic interactions with ancestral viruses. The AID/APOBEC family genes (AID and APOBEC1-4) encode cellular cytosine deaminases that target nucleic acids and catalyze C-to-U mutations. In the case of retroviral replication, APOBEC3 proteins induce C-to-U mutations in minus-stranded viral DNA, which results in G-to-A mutations in the viral genome. Previous studies have indicated that the expansion and rapid evolution of mammalian APOBEC3 genes has been driven by an arms race with retroviral parasites, but this has not been thoroughly investigated. Endogenous retroviruses (ERVs) are retrotransposons that originated from ancient retroviral infections. These sequences sometimes bear the hallmarks of APOBEC3-mediated mutations, and therefore serve as a record of the ancient conflict between retroviruses and APOBEC3 genes. Here we systematically investigated the sequences of ERVs and APOBEC3 genes in mammals to reconstruct details of the evolutionary conflict between them. We identified 1,420 AID/APOBEC family genes in a comprehensive screen of mammalian genomes. Of the AID/APOBEC family genes, APOBEC3 genes have been selectively amplified in mammalian genomes and show evidence of strong positive selection: whereas the catalytic domain was highly conserved across species, the loop 7 structure, which recognizes viral DNA/RNA substrates, was shown to be evolving under strong positive selection. Although APOBEC3 genes have been amplified by tandem gene duplication in most mammalian lineages, retrotransposition-mediated gene amplification was found in several mammals, including New World monkeys and prosimian primates. Comparative analysis revealed that G-to-A mutations have accumulated in ERVs, and that the G-to-A mutation signatures in ERVs are concordant with the target preferences of APOBEC3 proteins. Importantly, the number of APOBEC3 genes was significantly correlated with the frequency of G-to-A mutations in ERVs, suggesting that the amplification of APOBEC3 genes led to stronger attacks on ERVs and/or their ancestral retroviruses by APOBEC3 proteins. Furthermore, the numbers of APOBEC3 genes and ERVs in mammalian genomes were positively correlated, and in primates, the timing of APOBEC3 gene amplification was concordant with that of ERV invasions.
Our findings suggest that conflict with ancient retroviruses was a major selective pressure driving the rapid evolution of APOBEC3 genes in mammals.
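The strand relationship described in the abstract (a C-to-U change on the minus strand appearing as a G-to-A change in the viral genome) can be illustrated with a minimal Python sketch. The sequences are made-up toy data, and the function names `reverse_complement` and `count_g_to_a` are hypothetical helpers introduced here for illustration; this is not the authors' analysis pipeline. The sketch assumes two pre-aligned, gap-free sequences of equal length and tallies G-to-A substitutions by the base that follows the mutated G.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def count_g_to_a(reference: str, observed: str) -> Counter:
    """Tally plus-strand G-to-A substitutions, keyed by the mutated G plus
    the reference base that follows it (for example "GA" or "GG").

    Assumption: both sequences are already aligned and contain no gaps.
    """
    if len(reference) != len(observed):
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter()
    for i, (ref_base, obs_base) in enumerate(zip(reference, observed)):
        if ref_base == "G" and obs_base == "A":
            # "N" marks a mutated G at the very end of the sequence.
            next_base = reference[i + 1] if i + 1 < len(reference) else "N"
            counts["G" + next_base] += 1
    return counts


# Toy example (illustrative data, not from the study).
plus_ref = "ATGGACGGAT"
minus_ref = reverse_complement(plus_ref)

# Deaminate one C on the minus strand; U is read as T during copying.
minus_mut = minus_ref[:2] + "T" + minus_ref[3:]

# Copying the edited minus strand yields a plus strand carrying G-to-A.
plus_mut = reverse_complement(minus_mut)

context_counts = count_g_to_a(plus_ref, plus_mut)
print(plus_ref, plus_mut, dict(context_counts))
```

Keying the tally by the downstream base is one simple way to represent a mutation signature that could then be compared against a deaminase's target preference; a real analysis would also need alignment, a reconstructed ancestral sequence, and a background mutation rate.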
doi:10.1101/707190 fatcat:n2zpmm4davgz5dfdrv54n3abfm