1 Optimization of REDItools Package for Investigating RNA Editing in Thousands of Human Deep Sequencing Experiments
RNA editing is a widespread post-transcriptional mechanism that alters primary RNA sequences through the insertion, deletion, or modification of specific nucleotides. In humans, RNA editing affects nuclear and cytoplasmic transcripts, mainly through the deamination of adenosine (A) to inosine (I) by members of the ADAR enzyme family. A-to-I modifications increase transcriptome and proteome diversity and contribute to the modulation of gene expression at the RNA level. RNA editing by A-to-I change is prominent in
... coding regions containing Alu repetitive elements, whereas the list of ADAR substrates in protein-coding genes is relatively small. RNA editing modifies transcripts encoding several human neurotransmitter receptors and plays an important role in modulating their physiology. Indeed, its deregulation has been linked to a variety of human diseases, including neurological and neurodegenerative disorders, as well as cancer. Current technologies for massive transcriptome sequencing, such as RNA-Seq, are providing accurate maps of the transcriptional dynamics of complex eukaryotic genomes, such as the human genome, and are facilitating the detection of post-transcriptional RNA editing modifications with unprecedented resolution. However, detecting RNA editing events in RNA-Seq experiments is computationally intensive, because it requires scanning the human genome position by position. To investigate RNA editing in very large cohorts of RNA-Seq data, we have developed a novel algorithm called REDItools2.0. Here, we describe the core algorithm as well as the optimization strategies used to analyze RNA editing efficiently on HPC systems.
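
To illustrate why position-by-position detection is costly, the following is a minimal Python sketch of a naive scan; it is not the REDItools2.0 algorithm. It relies on the fact that inosine is read as guanosine (G) by sequencers, so A-to-I editing appears as A-to-G mismatches against the reference. The function name `scan_a_to_g`, the `(start, sequence)` read representation, and the `min_coverage` and `min_frequency` thresholds are all assumptions made for illustration and are not taken from the text above.

```python
from collections import Counter


def scan_a_to_g(reference, reads, min_coverage=2, min_frequency=0.1):
    """Report candidate A-to-G sites by scanning every reference position.

    reference: reference sequence (forward strand) as a string.
    reads: iterable of (start, sequence) tuples with 0-based starts.
           Assumption: reads are ungapped and already aligned.
    Thresholds are illustrative defaults, not values from any real tool.
    """
    # Build one base counter per reference position (a toy pileup).
    pileup = [Counter() for _ in reference]
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                pileup[pos][base] += 1

    candidates = []
    # Visit each position in turn; only reference adenosines can be A-to-I sites.
    for pos, ref_base in enumerate(reference):
        if ref_base != "A":
            continue
        counts = pileup[pos]
        coverage = sum(counts.values())
        if coverage < min_coverage:
            continue
        frequency = counts["G"] / coverage
        if frequency >= min_frequency:
            candidates.append((pos, coverage, counts["G"], frequency))
    return candidates


# Toy data: position 2 is A in the reference but G in three of four reads.
reference = "CCATGAC"
reads = [(0, "CCGTG"), (1, "CGTGA"), (2, "ATGAC"), (2, "GTGAC")]
sites = scan_a_to_g(reference, reads)
print(sites)
```

Even in this toy form, the work grows with both the genome length and the total number of aligned bases, which is why scanning a whole human genome across thousands of experiments calls for optimization on HPC systems.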