Secondary structure alone is generally not statistically significant for the detection of noncoding RNAs

E. Rivas, S. R. Eddy
2000 Bioinformatics  
Motivation: Several results in the literature suggest that biologically interesting RNAs have secondary structures that are more stable than expected by chance. Based on these observations, we developed a scanning algorithm for detecting noncoding RNA genes in genome sequences, using a fully probabilistic version of the Zuker minimum-energy folding algorithm. Results: Preliminary results were encouraging, but certain anomalies led us to do a carefully controlled investigation of this class of methods. Ultimately, our results argue that for the probabilistic model there is indeed a statistical effect, but it comes mostly from local base-composition bias and not from RNA secondary structure. For the thermodynamic implementation (which evaluates statistical significance by doing Monte Carlo shuffling in fixed-length sequence windows, thus eliminating the base-composition effect) the signals for noncoding RNAs are still usually indistinguishable from noise, especially when certain statistical artifacts resulting from local base-composition inhomogeneity are taken into account. We conclude that although a distinct, stable secondary structure is undoubtedly important in most noncoding RNAs, the stability of most noncoding RNA secondary structures is not sufficiently different from the predicted stability of a random sequence to be useful as a general genefinding approach.
doi:10.1093/bioinformatics/16.7.583 pmid:11038329 fatcat:ppdhf54obvf4lp2yweiwo7bq54
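
Illustrative note (not from the paper): the abstract mentions evaluating significance by Monte Carlo shuffling within fixed-length windows. A minimal sketch of that general idea follows, under loud assumptions: `toy_fold_score` is a hypothetical stand-in that counts base pairs Nussinov-style and is not the Zuker minimum-energy algorithm or the authors' implementation, and the names `shuffle_zscore` and `scan_windows`, the window size, and the shuffle count are all chosen here for illustration only.

```python
import random
import statistics

# Watson-Crick and G-U wobble pairs (an assumption for this toy scorer).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def toy_fold_score(seq, min_loop=3):
    """Hypothetical stand-in for a folding score: the maximum number of
    nested base pairs (Nussinov-style), with at least `min_loop` unpaired
    bases inside every hairpin. Higher means more structure."""
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case: position j is unpaired
            for k in range(i, j - min_loop):  # case: k pairs with j
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


def shuffle_zscore(window, score_fn=toy_fold_score, n_shuffles=100, rng=None):
    """Z-score of the window's score against shuffled copies of itself.
    Mononucleotide shuffling keeps the window's base composition fixed,
    so composition alone cannot produce a nonzero z-score."""
    rng = rng or random.Random(0)
    observed = score_fn(window)
    letters = list(window)
    null_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)
        null_scores.append(score_fn("".join(letters)))
    sd = statistics.pstdev(null_scores)
    if sd == 0:
        return 0.0  # no variation in the null sample: no evidence either way
    return (observed - statistics.mean(null_scores)) / sd


def scan_windows(seq, window=50, step=25, **kwargs):
    """Slide a fixed-length window along `seq`; return (start, z) pairs."""
    return [
        (start, shuffle_zscore(seq[start:start + window], **kwargs))
        for start in range(0, len(seq) - window + 1, step)
    ]
```

With a real energy function the sign convention flips (more stable means lower free energy, hence a negative z-score). The abstract's conclusion is that, for most noncoding RNAs, such scores are usually indistinguishable from noise.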