Supporting the Curation of Biological Databases with Reusable Text Mining

Olivo Miotto, Tin Wee Tan, Vladimir Brusic
2005 Genome Informatics Series  
Curators of biological databases transfer knowledge from scientific publications, a laborious and expensive manual process. Machine learning algorithms can reduce the workload of curators by filtering relevant biomedical literature, though their widespread adoption will depend on the availability of intuitive tools that can be configured for a variety of tasks. We propose a new method for supporting curators by means of document categorization, and describe the architecture of a
more » ... tool implementing this method using techniques that require no computational linguistic or programming expertise. To demonstrate the feasibility of this approach, we prototyped an application of this method to support a real curation task: identifying PubMed abstracts that contain allergen cross-reactivity information. We tested the performance of two different classifier algorithms (CART and ANN), applied to both composite and single-word features, using several feature scoring functions. Both classifiers exceeded our performance targets, the ANN classifier yielding the best results. These results show that the method we propose can deliver the level of performance needed to assist database curation.
doi:10.11234/gi1990.16.2_32 fatcat:ptmmatc3bfdbzgcpc47y3i2sze

DiMA: Sequence Diversity Dynamics Analyser for Viruses [article]

Shan Tharanga, Yongli Hu, Eyyub Selim Unlu, Muhammad Farhan Sjaugi, Muhammet A. Celik, Hilal Hekimoglu, Olivo Miotto, Muhammed Miran Oncel, Asif M. Khan
2022 arXiv   pre-print
Sequence diversity is one of the major challenges in the design of diagnostic, prophylactic and therapeutic interventions against viruses. Herein, we present DiMA, a tool designed to facilitate the dissection of sequence diversity dynamics for viruses. As a base, DiMA provides a quantitative overview of sequence diversity by use of Shannon's entropy, applied via a user-defined k-mer sliding window to an input alignment file. Distinctively, the key feature is that DiMA interrogates diversity
more » ... mics by dissecting each k-mer position to various diversity motifs, defined based on the incidence of distinct sequences. At a given position, an index is a predominant sequence, while all the others are (total) variants to the index, sub-classified into the major (most common) variant, minor variants (occurring more than once and of frequency lower than the major), and the unique (singleton) variants. Moreover, DiMA allows for metadata enrichment of the motifs. DiMA is big data ready and provides an interactive output, depicting multiple facets of sequence diversity, with download options. It enables comparative genome/proteome diversity dynamics analyses, within and between sequences of viral species. The web server is publicly available at
arXiv:2205.13915v1 fatcat:2skoi6f2sfh3pkscgtg2og62qy
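
The motif classification described in the DiMA abstract above can be illustrated with a minimal Python sketch. This is not DiMA's implementation: the function names `kmer_motifs` and `scan_alignment` are hypothetical, the input is assumed to be a list of equal-length aligned sequences, and ties in frequency are assumed to be broken by first occurrence. A most-common variant seen only once is treated here as unique rather than major, which is also an assumption.

```python
from collections import Counter
from math import log2


def kmer_motifs(aligned_seqs, k, pos):
    """Shannon entropy and diversity motifs for the k-mers starting at `pos`.

    Hypothetical helper, not part of DiMA. Assumes all sequences are
    aligned and have the same length.
    """
    kmers = [seq[pos:pos + k] for seq in aligned_seqs]
    counts = Counter(kmers)
    total = len(kmers)

    # Shannon entropy (in bits) of the k-mer distribution at this position.
    entropy = -sum((c / total) * log2(c / total) for c in counts.values())

    ranked = counts.most_common()   # ties keep first-seen order
    index_seq = ranked[0][0]        # predominant sequence
    variants = ranked[1:]           # everything else

    major = None
    minor, unique = [], []
    for seq, count in variants:
        if count == 1:
            unique.append(seq)      # singleton variants
        elif major is None:
            major = seq             # most common variant
        else:
            minor.append(seq)       # seen more than once, ranked below the major

    return {
        "entropy": entropy,
        "index": index_seq,
        "major": major,
        "minor": minor,
        "unique": unique,
    }


def scan_alignment(aligned_seqs, k):
    """Apply kmer_motifs over a sliding window of width k."""
    length = len(aligned_seqs[0])
    return [kmer_motifs(aligned_seqs, k, pos) for pos in range(length - k + 1)]
```

Calling `scan_alignment(seqs, 9)` on a protein alignment would give one record per nonamer position; the window width is the user-defined k mentioned in the abstract.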

Rule-based knowledge aggregation for large-scale protein sequence analysis of influenza A viruses

Olivo Miotto, Tin Tan, Vladimir Brusic
2008 BMC Bioinformatics  
The explosive growth of biological data provides opportunities for new statistical and comparative analyses of large information sets, such as alignments comprising tens of thousands of sequences. In such studies, sequence annotations frequently play an essential role, and reliable results depend on metadata quality. However, the semantic heterogeneity and annotation inconsistencies in biological databases greatly increase the complexity of aggregating and cleaning metadata. Manual curation of
more » ... atasets, traditionally favoured by life scientists, is impractical for studies involving thousands of records. In this study, we investigate quality issues that affect major public databases, and quantify the effectiveness of an automated metadata extraction approach that combines structural and semantic rules. We applied this approach to more than 90,000 influenza A records, to annotate sequences with protein name, virus subtype, isolate, host, geographic origin, and year of isolation. Results: Over 40,000 annotated Influenza A protein sequences were collected by combining information from more than 90,000 documents from NCBI public databases. Metadata values were automatically extracted, aggregated and reconciled from several document fields by applying user-defined structural rules. For each property, values were recovered from ≥88.8% of records, with accuracy exceeding 96% in most cases. Because of semantic heterogeneity, each property required up to six different structural rules to be combined. Significant quality differences between databases were found: GenBank documents yield values more reliably than documents extracted from GenPept. Using a simple set of semantic rules and a reasoner, we reconstructed relationships between sequences from the same isolate, thus identifying 7640 isolates. Validation of isolate metadata against a simple ontology highlighted more than 400 inconsistencies, leading to over 3,000 property value corrections. Conclusion: To overcome the quality issues inherent in public databases, automated knowledge aggregation with embedded intelligence is needed for large-scale analyses. Our results show that user-controlled intuitive approaches, based on combination of simple rules, can reliably automate various curation tasks, reducing the need for manual corrections to approximately 5% of the records. 
Emerging semantic technologies possess desirable features to support today's knowledge aggregation tasks, with a potential to bring immediate benefits to this field.
doi:10.1186/1471-2105-9-s1-s7 pmid:18315860 pmcid:PMC2259408 fatcat:4kuzd5yranbf3fkkro6m6g65by

Conservation and Variability of West Nile Virus Proteins

Qi Ying Koo, Asif M. Khan, Keun-Ok Jung, Shweta Ramdas, Olivo Miotto, Tin Wee Tan, Vladimir Brusic, Jerome Salmon, J. Thomas August, Darren P. Martin
2009 PLoS ONE  
West Nile virus (WNV) has emerged globally as an increasingly important pathogen for humans and domestic animals. Studies of the evolutionary diversity of the virus over its known history will help to elucidate conserved sites, and characterize their correspondence to other pathogens and their relevance to the immune system. We describe a large-scale analysis of the entire WNV proteome, aimed at identifying and characterizing evolutionarily conserved amino acid sequences. This study, which used
more » ... 2,746 WNV protein sequences collected from the NCBI GenPept database, focused on analysis of peptides of length 9 amino acids or more, which are immunologically relevant as potential T-cell epitopes. Entropy-based analysis of the diversity of WNV sequences revealed the presence of numerous evolutionarily stable nonamer positions across the proteome (entropy value of ≤1). The representation (frequency) of nonamers variant to the predominant peptide at these stable positions was, generally, low (≤10% of the WNV sequences analyzed). Eighty-eight fragments of length 9-29 amino acids, representing ~34% of the WNV polyprotein length, were identified to be identical and evolutionarily stable in all analyzed WNV sequences. Of the 88 completely conserved sequences, 67 are also present in other flaviviruses, and several have been associated with the functional and structural properties of viral proteins. Immunoinformatic analysis revealed that the majority (78/88) of conserved sequences are potentially immunogenic, while 44 contained experimentally confirmed human T-cell epitopes. This study identified a comprehensive catalogue of completely conserved WNV sequences, many of which are shared by other flaviviruses, and the majority are potential epitopes. The complete conservation of these immunologically relevant sequences through the entire recorded WNV history suggests they will be valuable as components of peptide-specific vaccines or other therapeutic applications, for sequence-specific diagnosis of a wide range of Flavivirus infections, and for studies of homologous sequences among other flaviviruses.
doi:10.1371/journal.pone.0005352 pmid:19401763 pmcid:PMC2670515 fatcat:knjgcvi2qvgjfnbfjywi6geuye

Analysis of viral diversity for vaccine target discovery

Asif M. Khan, Yongli Hu, Olivo Miotto, Natascha M. Thevasagayam, Rashmi Sukumaran, Hadia Syahirah Abd Raman, Vladimir Brusic, Tin Wee Tan, J. Thomas August
2017 BMC Medical Genomics  
Viral vaccine target discovery requires understanding the diversity of both the virus and the human immune system. The readily available and rapidly growing pool of viral sequence data in the public domain enables the identification and characterization of immune targets relevant to adaptive immunity. A systematic bioinformatics approach is necessary to facilitate the analysis of such large datasets for selection of potential candidate vaccine targets. Results: This work describes a
more » ... methodology to achieve this analysis, with data of dengue, West Nile, hepatitis A, HIV-1, and influenza A viruses as examples. Our methodology has been implemented as an analytical pipeline that brings significant advancement to the field of reverse vaccinology, enabling systematic screening of known sequence data in nature for identification of vaccine targets. This includes key steps (i) comprehensive and extensive collection of sequence data of viral proteomes (the virome), (ii) data cleaning, (iii) large-scale sequence alignments, (iv) peptide entropy analysis, (v) intra- and inter-species variation analysis of conserved sequences, including human homology analysis, and (vi) functional and immunological relevance analysis. Conclusion: These steps are combined into the pipeline ensuring that a more refined process, as compared to a simple evolutionary conservation analysis, will facilitate a better selection of vaccine targets and their prioritization for subsequent experimental validation.
doi:10.1186/s12920-017-0301-2 pmid:29322922 pmcid:PMC5763473 fatcat:7su4xz5aqrcflpbvorswaitn6a

Oxidative stress and protein damage responses mediate artemisinin resistance in malaria parasites

Frances Rocamora, Lei Zhu, Kek Yee Liong, Arjen Dondorp, Olivo Miotto, Sachel Mok, Zbynek Bozdech, Roland Cooper
2018 PLoS Pathogens  
Takala-Harrison S, Clark TG, Jacob CG, Cummings MP, Miotto O, et al. (2013) Genetic loci associated with delayed clearance of Plasmodium falciparum following artemisinin treatment in Southeast Asia.  ... 
doi:10.1371/journal.ppat.1006930 pmid:29538461 pmcid:PMC5868857 fatcat:qzsrc3cppjb6nadt3u2xticj5e

Plasmodium falciparum Founder Populations in Western Cambodia Have Reduced Artemisinin Sensitivity In Vitro

Chanaki Amaratunga, Benoit Witkowski, Dalin Dek, Vorleak Try, Nimol Khim, Olivo Miotto, Didier Ménard, Rick M. Fairhurst
2014 Antimicrobial Agents and Chemotherapy  
Miotto et al., submitted for publication). KH-U, which appears genetically admixed, shows a wide range of half-life values and cannot be reliably classified as fast clearing or slow clearing.  ...  Miotto et al., submitted for publication), we reassigned all 44 isolates to a core subpopulation (KH-C, n = 6), one of three western Cambodian founder populations (WKH-F01, n = 5; WKH-F02, n = 3; WKH-F04  ... 
doi:10.1128/aac.03055-14 pmid:24867977 pmcid:PMC4136061 fatcat:bqqdyvrsn5fytkvca4edlegcde

Identification of human-to-human transmissibility factors in PB2 proteins of influenza A by large-scale mutual information analysis

Olivo Miotto, AT Heiny, Tin Tan, J Thomas August, Vladimir Brusic
2008 BMC Bioinformatics  
The identification of mutations that confer unique properties to a pathogen, such as host range, is of fundamental importance in the fight against disease. This paper describes a novel method for identifying amino acid sites that distinguish specific sets of protein sequences, by comparative analysis of matched alignments. The use of mutual information to identify distinctive residues responsible for functional variants makes this approach highly suitable for analyzing large sets of sequences.
more » ... o support mutual information analysis, we developed the AVANA software, which utilizes sequence annotations to select sets for comparison, according to user-specified criteria. The method presented was applied to an analysis of influenza A PB2 protein sequences, with the objective of identifying the components of adaptation to human-to-human transmission, and reconstructing the mutation history of these components. Results: We compared over 3,000 PB2 protein sequences of human-transmissible and avian isolates, to produce a catalogue of sites involved in adaptation to human-to-human transmission. This analysis identified 17 characteristic sites, five of which have been present in human-transmissible strains since the 1918 Spanish flu pandemic. Sixteen of these sites are located in functional domains, suggesting they may play functional roles in host-range specificity. The catalogue of characteristic sites was used to derive sequence signatures from historical isolates. These signatures, arranged in chronological order, reveal an evolutionary timeline for the adaptation of the PB2 protein to human hosts. Conclusion: By providing the most complete elucidation to date of the functional components participating in PB2 protein adaptation to humans, this study demonstrates that mutual information is a powerful tool for comparative characterization of sequence sets. In addition to confirming previously reported findings, several novel characteristic sites within PB2 are reported. Sequence signatures generated using the characteristic sites catalogue characterize concisely the adaptation characteristics of individual isolates. Evolutionary timelines derived from signatures of early human influenza isolates suggest that characteristic variants emerged rapidly, and remained remarkably stable through subsequent pandemics. 
In addition, the signatures of human-infecting H5N1 isolates suggest that this avian subtype has low pandemic potential at present, although it presents more human adaptation components than most avian subtypes.
doi:10.1186/1471-2105-9-s1-s18 pmid:18315849 pmcid:PMC2259419 fatcat:cvvp2vos4feyhj3fd5p5jyrw6q
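
The mutual information comparison described in the abstract above can be sketched in Python as follows. This is an illustrative sketch, not the AVANA software: the function names `column_mi` and `site_mutual_information` are hypothetical, the two inputs are assumed to be matched alignments (lists of equal-length sequences, for example human-transmissible and avian isolates), and the score is the standard mutual information between the residue at a site and the group label, which may differ from the paper's exact formulation.

```python
from collections import Counter
from math import log2


def column_mi(col_a, col_b):
    """Mutual information (bits) between residue identity and group label
    for one alignment column. Hypothetical helper, not AVANA's API."""
    n_a, n_b = len(col_a), len(col_b)
    n = n_a + n_b

    # Joint counts of (residue, group) and marginal counts of each.
    joint = Counter((r, "A") for r in col_a) + Counter((r, "B") for r in col_b)
    residue = Counter(col_a) + Counter(col_b)
    group = {"A": n_a, "B": n_b}

    mi = 0.0
    for (r, g), count in joint.items():
        p_xy = count / n
        p_x = residue[r] / n
        p_y = group[g] / n
        mi += p_xy * log2(p_xy / (p_x * p_y))
    return mi


def site_mutual_information(group_a, group_b):
    """Return one MI value per alignment column for two matched alignments."""
    cols_a = list(zip(*group_a))   # transpose sequences into columns
    cols_b = list(zip(*group_b))
    return [column_mi(a, b) for a, b in zip(cols_a, cols_b)]
```

Sites with high scores are those where the residue largely determines which set a sequence belongs to; a site that is identical in both sets scores zero. Any threshold for calling a site "characteristic" would be a further assumption and is left out here.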

A systematic bioinformatics approach for selection of epitope-based vaccine targets

Asif M. Khan, Olivo Miotto, A.T. Heiny, Jerome Salmon, K.N. Srinivasan, Eduardo J.M. Nascimento, Ernesto T.A. Marques, Vladimir Brusic, Tin Wee Tan, J. Thomas August
2006 Cellular Immunology  
Epitope-based vaccines provide a new strategy for prophylactic and therapeutic application of pathogen-specific immunity. A critical requirement of this strategy is the identification and selection of T-cell epitopes that act as vaccine targets. This study describes current methodologies for the selection process, with dengue virus as a model system. A combination of publicly available bioinformatics algorithms and computational tools are used to screen and select antigen sequences as potential
more » ... T-cell epitopes of supertype HLA alleles. The selected sequences are tested for biological function by their activation of T-cells of HLA transgenic mice and of pathogen infected subjects. This approach provides an experimental basis for the design of pathogen specific, T-cell epitope-based vaccines that are targeted to majority of the genetic variants of the pathogen, and are effective for a broad range of differences in human leukocyte antigens among the global human population.
doi:10.1016/j.cellimm.2007.02.005 pmid:17434154 pmcid:PMC2041846 fatcat:minkmdxdibcfblutnyj4fi3eaa

Complete-Proteome Mapping of Human Influenza A Adaptive Mutations: Implications for Human Transmissibility of Zoonotic Strains

Olivo Miotto, A. T. Heiny, Randy Albrecht, Adolfo García-Sastre, Tin Wee Tan, J. Thomas August, Vladimir Brusic, Art F. Y. Poon
2010 PLoS ONE  
There is widespread concern that H5N1 avian influenza A viruses will emerge as a pandemic threat, if they become capable of human-to-human (H2H) transmission. Avian strains lack this capability, which suggests that it requires important adaptive mutations. We performed a large-scale comparative analysis of proteins from avian and human strains, to produce a catalogue of mutations associated with H2H transmissibility, and to detect their presence in avian isolates. Methodology/Principal
more » ... We constructed a dataset of influenza A protein sequences from 92,343 public database records. Human and avian sequence subsets were compared, using a method based on mutual information, to identify characteristic sites where human isolates present conserved mutations. The resulting catalogue comprises 68 characteristic sites in eight internal proteins. Subtype variability prevented the identification of adaptive mutations in the hemagglutinin and neuraminidase proteins. The high number of sites in the ribonucleoprotein complex suggests interdependence between mutations in multiple proteins. Characteristic sites are often clustered within known functional regions, suggesting their functional roles in cellular processes. By isolating and concatenating characteristic site residues, we defined adaptation signatures, which summarize the adaptive potential of specific isolates. Most adaptive mutations emerged within three decades after the 1918 pandemic, and have remained remarkably stable thereafter. Two lineages with stable internal protein constellations have circulated among humans without reassorting. On the contrary, H5N1 avian and swine viruses reassort frequently, causing both gains and losses of adaptive mutations. Conclusions: Human host adaptation appears to be complex and systemic, involving nearly all influenza proteins. Adaptation signatures suggest that the ability of H5N1 strains to infect humans is related to the presence of an unusually high number of adaptive mutations. However, these mutations appear unstable, suggesting low pandemic potential of H5N1 in its current form. In addition, adaptation signatures indicate that pandemic H1N1/09 strain possesses multiple human-transmissibility mutations, though not an unusually high number with respect to swine strains that infected humans in the past. Adaptation signatures provide a novel tool for identifying zoonotic strains with the potential to infect humans.
doi:10.1371/journal.pone.0009025 pmid:20140252 pmcid:PMC2815782 fatcat:bwhr6n25ujgadoydf2dmzyyo5u

Origins of the current outbreak of multidrug resistant malaria in Southeast Asia: a retrospective genetic study [article]

Roberto Amato, Richard D. Pearson, Jacob Almagro-Garcia, Chanaki Amaratunga, Pharath Lim, Seila Suon, Sokunthea Sreng, Eleanor Drury, Jim Stalker, Olivo Miotto, Rick M. Fairhurst, Dominic P. Kwiatkowski
2017 bioRxiv   pre-print
Antimalarial failure is rapidly spreading across parts of Southeast Asia where dihydroartemisinin-piperaquine (DHA-PPQ) is used as first line treatment. The first published reports came from western Cambodia in 2013. Here we analyse genetic changes in the Plasmodium falciparum population of western Cambodia in the six years prior to that. Methods: We analysed genome sequence data on 1492 P. falciparum samples from Southeast Asia, including 464 collected in western Cambodia between 2007 and
more » ... Different epidemiological origins of resistance were identified by haplotypic analysis of the kelch13 artemisinin resistance locus and the plasmepsin 2-3 piperaquine resistance locus. Findings: We identified over 30 independent origins of artemisinin resistance, of which the KEL1 lineage accounted for 91% of DHA-PPQ-resistant parasites. In 2008, KEL1 combined with PLA1, the major lineage associated with piperaquine resistance. By 2012, the KEL1/PLA1 co-lineage had reached over 60% frequency in western Cambodia and had spread to northern Cambodia. Interpretation: The KEL1/PLA1 co-lineage emerged in the same year that DHA-PPQ became the first line antimalarial drug in western Cambodia and spread aggressively thereafter, displacing other artemisinin-resistant parasite lineages. These findings have significant implications for management of the global health risk associated with the current outbreak.
doi:10.1101/208371 fatcat:t7376uv4pnaxbd3uxiduhyhcfa

Fitness Loss under Amino Acid Starvation in Artemisinin-Resistant Plasmodium falciparum Isolates from Cambodia

Duangkamon Bunditvorapoom, Theerarat Kochakarn, Namfon Kotanan, Charin Modchang, Krittikorn Kümpornsin, Duangkamon Loesbanluechai, Thanyaluk Krasae, Liwang Cui, Kesinee Chotivanich, Nicholas J. White, Prapon Wilairat, Olivo Miotto (+1 others)
2018 Scientific Reports  
Artemisinin is the most rapidly effective drug for Plasmodium falciparum malaria treatment currently in clinical use. Emerging artemisinin-resistant parasites pose a great global health risk. At present, the level of artemisinin resistance is still relatively low with evidence pointing towards a trade-off between artemisinin resistance and fitness loss. Here we show that artemisinin-resistant P. falciparum isolates from Cambodia manifested fitness loss, showing fewer progenies during the
more » ... rythrocytic developmental cycle. The loss in fitness was exacerbated under the condition of low exogenous amino acid supply. The resistant parasites failed to undergo maturation, whereas their drug-sensitive counterparts were able to complete the erythrocytic cycle under conditions of amino acid deprivation. The artemisinin-resistant phenotype was not stable, and loss of the phenotype was associated with changes in the expression of a putative target, Exp1, a membrane glutathione transferase. Analysis of SNPs in haemoglobin processing genes revealed associations with parasite clearance times, suggesting changes in haemoglobin catabolism may contribute to artemisinin resistance. These findings on fitness and protein homeostasis could provide clues on how to contain emerging artemisinin-resistant parasites. Artemisinin and its derivatives have saved millions of malaria patients' lives by their rapidity of action 1 . Artemisinin and its derivatives are the only drugs in clinical use that can kill every intra-erythrocytic stage of human malaria parasite Plasmodium falciparum 1 . Global campaigns have been launched to prevent artemisinin resistance by administering artemisinin only as combination therapies and monitoring artemisinin sensitivity by measuring parasite clearance times at key sentinel sites 2 . Despite ongoing efforts, P. falciparum infections with delayed parasite clearance following artemisinin treatment began to emerge in Cambodia and, after ten years, have become prevalent throughout the Greater Mekong subregion 3,4 . Even though the current artemisinin
doi:10.1038/s41598-018-30593-5 pmid:30135481 pmcid:PMC6105667 fatcat:sk3ihu4x3zc77ixp5r2v53szkm

Evolutionarily Conserved Protein Sequences of Influenza A Viruses, Avian and Human, as Vaccine Targets

A. T. Heiny, Olivo Miotto, Kellathur N. Srinivasan, Asif M. Khan, G. L. Zhang, Vladimir Brusic, Tin Wee Tan, J. Thomas August, Berend Snel
2007 PLoS ONE  
Background. Influenza A viruses generate an extreme genetic diversity through point mutation and gene segment exchange, resulting in many new strains that emerge from the animal reservoirs, among which was the recent highly pathogenic H5N1 virus. This genetic diversity also endows these viruses with a dynamic adaptability to their habitats, one result being the rapid selection of genomic variants that resist the immune responses of infected hosts. With the possibility of an influenza A
more » ... a critical need is a vaccine that will recognize and protect against any influenza A pathogen. One feasible approach is a vaccine containing conserved immunogenic protein sequences that represent the genotypic diversity of all current and future avian and human influenza viruses as an alternative to current vaccines that address only the known circulating virus strains. Methodology/Principal Findings. Methodologies for large-scale analysis of the evolutionary variability of the influenza A virus proteins recorded in public databases were developed and used to elucidate the amino acid sequence diversity and conservation of 36,343 sequences of the 11 viral proteins of the recorded virus isolates of the past 30 years. Technologies were also applied to identify the conserved amino acid sequences from isolates of the past decade, and to evaluate the predicted human lymphocyte antigen (HLA) supertype-restricted class I and II T-cell epitopes of the conserved sequences. Fifty-five (55) sequences of 9 or more amino acids of the polymerases (PB2, PB1, and PA), nucleoprotein (NP), and matrix 1 (M1) proteins were completely conserved in at least 80%, many in 95 to 100%, of the avian and human influenza A virus isolates despite the marked evolutionary variability of the viruses. Almost all (50) of these conserved sequences contained putative supertype HLA class I or class II epitopes as predicted by 4 peptide-HLA binding algorithms. Additionally, data of the Immune Epitope Database (IEDB) include 29 experimentally identified HLA class I and II T-cell epitopes present in 14 of the conserved sequences. Conclusions/Significance. This study of all reported influenza A virus protein sequences, avian and human, has identified 55 highly conserved sequences, most of which are predicted to have immune relevance as T-cell epitopes. This is a necessary first step in the design and analysis of a polyepitope, pan-influenza A vaccine. 
In addition to the application described herein, these technologies can be applied to other pathogens and to other therapeutic modalities designed to attack DNA, RNA, or protein sequences critical to pathogen function.
doi:10.1371/journal.pone.0001190 pmid:18030326 pmcid:PMC2065905 fatcat:2kbl5wgn5bddranhenhlfbjnhi

Modulation of triple artemisinin-based combination therapy pharmacodynamics by Plasmodium falciparum genotype [article]

Megan R. Ansbro, Zina Itkin, Lu Chen, Gergely Zahoranszky-Kohalmi, Chanaki Amaratunga, Olivo Miotto, Tyler Peryea, Charlotte V. Hobbs, Seila Suon, Juliana M. S, Arjen M Dondorp, Rob W. van der Pluijm (+3 others)
2020 bioRxiv   pre-print
The first-line treatments for uncomplicated Plasmodium falciparum malaria are artemisinin-based combination therapies (ACTs), consisting of an artemisinin derivative combined with a longer acting partner drug. However, the spread of P. falciparum with decreased susceptibility to artemisinin and partner drugs presents a significant challenge to malaria control efforts. To stem the spread of drug resistant parasites, novel chemotherapeutic strategies are being evaluated, including the
more » ... on of triple artemisinin-based combination therapies (TACTs). Currently, there is limited knowledge on the pharmacodynamics and pharmacogenetic interactions of proposed TACT drug combinations. To evaluate these interactions, we established an in vitro high-throughput process for measuring the drug dose-response to three distinct antimalarial drugs present in a TACT. Sixteen different TACT combinations were screened against fifteen parasite lines from Cambodia, with a focus on parasites with differential susceptibilities to piperaquine and artemisinins. Analysis revealed drug-drug interactions unique to specific genetic backgrounds, including antagonism between piperaquine and pyronaridine associated with gene amplification of plasmepsin II/III, two aspartic proteases that localize to the parasite digestive vacuole. From this initial study, we identified parasite genotypes with decreased susceptibility to specific TACTs, as well as potential TACTs that display antagonism in a genotype-dependent manner. Our assay and analysis platform can be further leveraged to inform drug implementation decisions and evaluate next-generation TACTs.
doi:10.1101/2020.07.03.187039 fatcat:oduihunwgfcgfa62sgyrlzpdpy

Comparative genome-wide analysis and evolutionary history of haemoglobin-processing and haem detoxification enzymes in malarial parasites

Patrath Ponsuwanna, Theerarat Kochakarn, Duangkamon Bunditvorapoom, Krittikorn Kümpornsin, Thomas D. Otto, Chase Ridenour, Kesinee Chotivanich, Prapon Wilairat, Nicholas J. White, Olivo Miotto, Thanat Chookajorn
2016 Malaria Journal  
Malaria parasites have evolved a series of intricate mechanisms to survive and propagate within host red blood cells. Intra-erythrocytic parasitism requires these organisms to digest haemoglobin and detoxify iron-bound haem. These tasks are executed by haemoglobin-specific proteases and haem biocrystallization factors that are components of a large multi-subunit complex. Since haemoglobin processing machineries are functionally and genetically linked to the modes of action and resistance
more » ... ms of several anti-malarial drugs, an understanding of their evolutionary history is important for drug development and drug resistance prevention. Methods: Maximum likelihood trees of genetic repertoires encoding haemoglobin processing machineries within Plasmodium species, and with the representatives of Apicomplexan species with various host tropisms, were created. Genetic variants were mapped onto existing three-dimensional structures. Genome-wide single nucleotide polymorphism data were used to analyse the selective pressure and the effect of these mutations at the structural level. Results: Recent expansions in the falcipain and plasmepsin repertoires are unique to human malaria parasites especially in the Plasmodium falciparum and P. reichenowi lineage. Expansion of haemoglobin-specific plasmepsins occurred after the separation event of Plasmodium species, but the other members of the plasmepsin family were evolutionarily conserved with one copy for each sub-group in every Apicomplexan species. Haemoglobin-specific falcipains are separated from invasion-related falcipain, and their expansions within one specific locus arose independently in both P. falciparum and P. vivax lineages. Gene conversion between P. falciparum falcipain 2A and 2B was observed in artemisinin-resistant strains. Comparison between the numbers of non-synonymous and synonymous mutations suggests a strong selective pressure at falcipain and plasmepsin genes. The locations of amino acid changes from non-synonymous mutations mapped onto protein structures revealed clusters of amino acid residues in close proximity or near the active sites of proteases.
doi:10.1186/s12936-016-1097-9 pmid:26821618 pmcid:PMC4731938 fatcat:ostyy5ydmbd7ldnvhfaz56txmq