Conserved associations between G-quadruplex-forming DNA motifs and virulence gene families in malaria parasites [post]

2020 unpublished
The Plasmodium genus of malaria parasites encodes several families of antigen-encoding genes. These genes tend to be hyper-variable, highly recombinogenic and variantly expressed. The best-characterized family is the var genes, exclusively found in the Laveranian subgenus of malaria parasites infecting humans and great apes. Var genes encode major virulence factors involved in immune evasion and the maintenance of chronic infections. In the human parasite P. falciparum, var gene recombination and diversification appear to be promoted by G-quadruplex (G4) DNA motifs, which are strongly associated with var genes in P. falciparum. Here, we investigated how this association might have evolved across Plasmodium species, both Laverania and also more distantly related species which lack vars but encode other, more ancient variant gene families.
Results: The association between var genes and G4-forming motifs was conserved across Laverania, spanning ~1 million years of evolutionary time, with suggestive evidence for evolution of the association occurring within this subgenus. In rodent malaria species, G4-forming motifs were somewhat associated with pir genes, but this was not conserved in the Laverania, nor did we find a strong association of these motifs with any gene family in a second outgroup of avian malaria parasites. Secondly, we compared two different G4 prediction algorithms in their performance on extremely A/T-rich Plasmodium genomes, and also compared these predictions with experimental data from G4-seq, a DNA sequencing method for identifying G4-forming motifs. We found a surprising lack of concordance between the two algorithms and also between the algorithms and G4-seq data.
Conclusions: G4-forming motifs are uniquely strongly associated with Plasmodium var genes, suggesting a particular role for G4s in recombination and diversification of these genes. Secondly, in the A/T-rich genomes of Plasmodium species, the choice of prediction algorithm may be particularly influential when studying G4s in these important protozoan pathogens.
Background
Malaria is caused by protozoan Plasmodium parasites: in humans it causes considerable morbidity and is still responsible for almost half a million deaths each year [1]. Most severe cases of human malaria are caused by Plasmodium falciparum, but a further five parasite species can infect humans and there are many more species that infect rodents, birds and other vertebrates.
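To make the idea of a sequence-based G4 prediction concrete, the following is a minimal sketch of a regular-expression scan for putative G4-forming motifs. It assumes the widely used "canonical" motif definition (four runs of at least three guanines separated by loops of 1 to 7 nucleotides); this is an illustrative assumption and is not necessarily either of the two algorithms compared in this study. The function name `find_g4_motifs` and the toy sequence are hypothetical.

```python
import re

# Assumed motif definition (not taken from the source): four G-runs of >= 3 nt
# separated by loops of 1-7 nt. A C-rich match on the given strand implies a
# G-rich motif on the complementary strand.
_LOOP = "[ACGT]{1,7}"
_G4_PLUS = re.compile(_LOOP.join(["G{3,}"] * 4))
_G4_MINUS = re.compile(_LOOP.join(["C{3,}"] * 4))


def find_g4_motifs(seq):
    """Return sorted (start, end, strand) tuples for non-overlapping putative G4 motifs.

    Coordinates are 0-based, end-exclusive, and refer to the input strand.
    """
    seq = seq.upper()
    hits = [(m.start(), m.end(), "+") for m in _G4_PLUS.finditer(seq)]
    hits += [(m.start(), m.end(), "-") for m in _G4_MINUS.finditer(seq)]
    return sorted(hits)


# Toy example: a telomere-like G-rich repeat embedded in A/T-rich flanks.
toy = "AT" * 5 + "GGGTTAGGGTTAGGGTTAGGG" + "AT" * 5
print(find_g4_motifs(toy))  # [(10, 31, '+')]
```

A fixed pattern like this reports only motifs that fit the assumed run and loop lengths, which is one reason different prediction approaches can disagree with each other and with experimental data.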
In all vertebrate hosts, the disease involves cyclical infection of erythrocytes by Plasmodium parasites. The infected cells are exposed to circulating immune factors and also to splenic clearance, yet many malarias can lead to both chronic and repeated infections, indicating that the parasites have considerable capacities for immune evasion. These capacities have been linked, in several species, to the variant expression of virulence factors that are exposed on the infected erythrocyte surface and are encoded by highly variable families of variantly expressed virulence genes. Such gene families are best-characterized in P. falciparum, where the var gene family encodes ~60 variants of P. falciparum Erythrocyte Membrane Protein 1 (PfEMP1) [2-4], an adhesin which allows infected erythrocytes to adhere to the vascular endothelium and avoid splenic clearance. PfEMP1 proteins are critical virulence factors and they also contribute to disease, exacerbating vascular occlusion, hypoxia and lethal syndromes such as cerebral malaria. Var genes are restricted to the Laveranian subgenus of Plasmodium, which infects great apes and includes the human-infecting P. falciparum species [5], but other Plasmodium species encode other gene families that may have similar roles, including the sicavar family in the macaque parasite P. knowlesi [6, 7] and the pir family, which appears widely in many species from rodent to human malarias [8, 9] (Fig. 1). Effective antigenic variation over long periods of time requires highly regulated and mutually exclusive gene expression, and this has indeed been well characterized in the case of the var genes [10]. Other families, including sicavar [11] and pir [12], appear to have similar dynamics, albeit with less tight mutually exclusive regulation. In the case of P. falciparum, var genes are regulated by epigenetic silencing and expression switching [10], and additional antigenic diversity is generated within the ~60-member gene family via frequent recombination during both mitosis and meiosis [13, 14]. Thus, var gene regulation is a key virulence mechanism for the maintenance of chronic infections caused by P. falciparum: understanding this at the molecular level is an area of great interest in malaria biology. A decade ago, it was first observed that many var genes are associated with DNA motifs that could form G-quadruplexes (G4s) [15]. G4s are DNA or RNA secondary structures of the general form
doi:10.21203/rs.2.17995/v2 fatcat:ue5c7ymmqzbpvbfpeqnyijzqge