Prediction of RNA Pseudoknotted Secondary Structure using Stochastic Context Free Grammars (SCFG)

Rafael García
2006 CLEI Electronic Journal  
Pseudoknots are a frequent RNA structure that plays essential roles in a variety of biocatalytic cell functions. One of the most challenging problems in bioinformatics is predicting this secondary structure from the base-pair sequence that dictates it. Previously, a model adapted from computational linguistics, Stochastic Context Free Grammars (SCFG), has been used to predict RNA secondary structure. However, to date the SCFG approach imposes a prohibitive complexity cost, O(n^4), when applied to the prediction of pseudoknots, mainly because a context-sensitive grammar is formally required to analyze them. Other hybrid approaches (energy maximization) achieve O(n^3) complexity in the best case, and they also impose several restrictions on the maximum sequence length that can be analyzed in practice. Here we introduce a novel algorithm, based on pattern matching techniques, that uses a sequential approximation strategy to solve the original problem. This algorithm not only reduces the complexity to O(n^2 log n), but also increases the maximum sequence length that can be handled, as well as the capacity to analyze several pseudoknots simultaneously.
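As an illustration of why pseudoknots fall outside what a context-free grammar can describe: a context-free derivation can only produce nested base pairs, whereas a pseudoknot contains at least two base pairs that cross. The sketch below is not the algorithm introduced in this paper. It is a minimal Python check for crossing pairs, and the function name `has_pseudoknot` and the example pair lists are hypothetical, chosen only for illustration.

```python
def has_pseudoknot(pairs):
    """Return True if any two base pairs cross.

    `pairs` is a list of (i, j) index pairs into the sequence.
    Two pairs (i, j) and (k, l) cross when i < k < j < l; a purely
    nested structure has no such pair of pairs.
    """
    # Normalize each pair so the smaller index comes first, then sort by it.
    ordered = sorted((min(i, j), max(i, j)) for i, j in pairs)
    # Simple quadratic scan over all pairs of base pairs (illustrative only).
    for a, (i, j) in enumerate(ordered):
        for k, l in ordered[a + 1:]:
            if i < k < j < l:
                return True
    return False


# Hypothetical example structures (indices are positions in the sequence).
nested = [(0, 9), (1, 8), (2, 7)]            # a plain hairpin stem
knotted = [(0, 6), (1, 5), (3, 9), (4, 8)]   # two stems that interleave

print(has_pseudoknot(nested))
print(has_pseudoknot(knotted))
```

The quadratic scan is chosen for readability rather than speed; it only detects whether a given structure contains crossing pairs and does not predict a structure from a sequence.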
doi:10.19153/cleiej.9.2.6 fatcat:y55icxerojg6lce7nrnzi5b3wa