Using structural and evolutionary information to detect and correct pyrosequencing errors in non-coding RNAs [article]

Vladimir Reinharz, Yann Ponty, Jérôme Waldispühl
2013 arXiv pre-print
Analysis of the sequence-structure relationship in RNA molecules is essential to evolutionary studies, and also to concrete applications such as error-correction methodologies in sequencing technologies. The prohibitive sizes of the mutational and conformational landscapes, combined with the volume of data to process, require efficient algorithms to compute sequence-structure properties. More specifically, here we aim to calculate which mutations most increase the likelihood of a sequence to ... given structure and RNA family. In this paper, we introduce RNApyro, an efficient inside-outside algorithm, linear in time and space, that computes exact mutational probabilities under secondary structure and evolutionary constraints given as a multiple sequence alignment with a consensus structure. We develop a scoring scheme combining classical stacking base-pair energies with novel isostericity scales, and apply our techniques to correct point-wise errors in 5S and 16S rRNA sequences. Our results suggest that RNApyro is a promising algorithm to complement existing tools in the NGS error-correction pipeline.
arXiv:1305.7068v1 fatcat:hr6xeqwvvbbi5gxlq7suunngve
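
To make the notion of "mutational probabilities" in the abstract concrete, here is a minimal brute-force sketch in Python. It is not RNApyro's inside-outside algorithm and does not use its scoring scheme: it enumerates every sequence within a bounded number of point mutations of the observed read, which is exponential in the mutation bound, and weights each candidate with a toy score. All names (`toy_weight`, `neighbours`, `mutational_probabilities`), the `pair_bonus` and `pseudo` parameters, and the per-column `profile` standing in for the multiple sequence alignment are assumptions made for illustration only.

```python
import itertools
import math

NUCS = "ACGU"
# Canonical and wobble pairs; a stand-in for real stacking/isostericity scores.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def parse_pairs(structure):
    """Return the base-pair index list of a dot-bracket structure."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            pairs.append((stack.pop(), i))
    return pairs


def toy_weight(seq, pairs, profile, pair_bonus=2.0, pseudo=0.01):
    """Hypothetical Boltzmann-like weight: profile agreement plus pairing bonus."""
    log_w = 0.0
    for i, x in enumerate(seq):
        # Evolutionary term: frequency of x in alignment column i (assumed input).
        log_w += math.log(profile[i].get(x, 0.0) + pseudo)
    for i, j in pairs:
        # Structural term: reward pairs compatible with the consensus structure.
        if (seq[i], seq[j]) in CANONICAL:
            log_w += pair_bonus
    return math.exp(log_w)


def neighbours(seq, max_mut):
    """Yield every sequence at Hamming distance <= max_mut from seq."""
    for k in range(max_mut + 1):
        for positions in itertools.combinations(range(len(seq)), k):
            choices = [[c for c in NUCS if c != seq[p]] for p in positions]
            for subs in itertools.product(*choices):
                cand = list(seq)
                for p, c in zip(positions, subs):
                    cand[p] = c
                yield "".join(cand)


def mutational_probabilities(seq, structure, profile, max_mut):
    """For each position, the weighted probability of each nucleotide."""
    pairs = parse_pairs(structure)
    total = 0.0
    acc = [dict.fromkeys(NUCS, 0.0) for _ in seq]
    for cand in neighbours(seq, max_mut):
        w = toy_weight(cand, pairs, profile)
        total += w
        for i, x in enumerate(cand):
            acc[i][x] += w
    return [{x: v / total for x, v in col.items()} for col in acc]


# Toy input: the last base of the read disagrees with the alignment column
# and breaks the consensus base pair (0, 4).
read = "GAAAG"
consensus = "(...)"
profile = [{"G": 1.0}, {"A": 1.0}, {"A": 1.0}, {"A": 1.0}, {"C": 1.0}]
probs = mutational_probabilities(read, consensus, profile, max_mut=1)
print(max(probs[4], key=probs[4].get))
```

In this toy setting, a position whose most probable nucleotide differs from the observed one is a candidate sequencing error. The abstract's claim is that RNApyro obtains exact probabilities of this kind in linear time and space, rather than by enumeration as done here.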